Are there CivicPlus sites that run on non-CivicPlus domains? #82
Comments
While I've spent about an hour checking to see if any other CivicPlus sites with .gov or .org URLs do not correspond to a URL of the form
However, I can't definitively prove that this is always true. A more comprehensive fix would be to build more robust site detection (not to be confused with the method discussed in #69). At present, we identify Agenda Center sites by manually searching an online subdomain enumeration tool. We could instead develop a way to programmatically identify websites built with CivicPlus's Agenda Center product. More generally, we may eventually want to automatically detect websites built with other meeting software, e.g., Legistar.

The best solution I can think of is to write a script that uses both the Google Custom Search API and subdomain-enumeration libraries. The Google API could be used to detect, for example, the first 1,000 or so results for the searches
@DiPierro Thanks for digging into this! This sounds like good news: it appears we can generally assume that CivicPlus sites have a working subdomain. It may be that the initial site-discovery methodology you describe unearthed URLs that are no longer valid, so it may simply be a matter of identifying and updating the canonical URLs for problematic sites in our list of known CivicPlus sites. That list includes a lot of

I think we can address this as a mixed task, part coding and part research. We should be able to easily write a script that steps through all URLs and tests

That process should help us figure out whether all the CivicPlus sites we're aware of have standard subdomains on CivicPlus, and help us decide what changes, if any, are needed to address the "unique name" issue described in #80.

@DiPierro Do you want to take on that scripting/research as part of the aw-scripts library? Alternatively, we can flag this as a "help wanted" issue to see if we can find volunteers to take a stab.
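For anyone picking this up, here is a minimal sketch of the "step through all URLs" idea, not the project's actual script. It assumes each check only needs a status code and the URL the request finally lands on. The names `fetch_with_urllib` and `check_sites` are hypothetical, and the fetcher is injectable so the loop can be exercised without network access.

```python
import urllib.error
import urllib.request
from urllib.parse import urlsplit


def fetch_with_urllib(url, timeout=10):
    """Return (status_code, final_url); status_code is None if unreachable."""
    try:
        # urlopen follows redirects, so geturl() is the URL we ended up on.
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.geturl()
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses: keep the code, report the URL we asked for.
        return err.code, url
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failure, timeout, refused connection, or malformed URL.
        return None, url


def check_sites(urls, fetch=fetch_with_urllib):
    """Check every URL and note whether the final host differs from the original."""
    rows = []
    for url in urls:
        status_code, final_url = fetch(url)
        rows.append({
            "url": url,
            "status_code": status_code,
            "final_url": final_url,
            "host_changed": urlsplit(url).hostname != urlsplit(final_url).hostname,
        })
    return rows
```

Rows where `status_code` is `None` or `host_changed` is true would be the ones worth a manual look.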
Hi @zstumgoren - would you mind flagging this scripting/research task as "help wanted" for now? I'm not certain how much time I'll have in the coming week or so. The task strikes me as a good fit for other volunteers should they have interest, and I wouldn't want to delay. Thank you.
@zstumgoren I've started stepping through our list of CivicPlus domains using a modified version of generate_civicplus_sites.py so that we know we're using a clean list of domains. The script produces a csv that includes these fields:
Can you think of other fields I should be tracking? Should I separately pass each domain into
Here's a CSV merging the public list of URLs with the status_code, history, and alias fields described above: https://docs.google.com/spreadsheets/d/19t6vnl514kUyoSHKq3rMVA8y3O_hQ6KXk-HiUBB78xo/edit?usp=sharing
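In case it helps anyone reproduce the merge, here is a rough sketch of joining the public list with the check results. It is an illustration, not the code that produced the spreadsheet. It assumes both inputs are lists of dicts sharing a `url` key (a hypothetical column name), and it treats `status_code`, `history`, and `alias` as opaque values.

```python
import csv
import io

CHECK_FIELDS = ("status_code", "history", "alias")


def merge_on_url(public_rows, check_rows, key="url"):
    """Left-join check results onto the public list, keyed on the URL column."""
    checks = {row[key]: row for row in check_rows}
    merged = []
    for row in public_rows:
        extra = checks.get(row[key], {})
        # Sites with no check result get empty strings rather than being dropped.
        merged.append({**row, **{f: extra.get(f, "") for f in CHECK_FIELDS}})
    return merged


def to_csv(rows):
    """Serialize merged rows to CSV text, using the first row's keys as the header."""
    if not rows:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

A left join keeps every site from the public list, so unchecked sites show up as blank cells instead of silently disappearing.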
Most of the ~1,500 known CivicPlus sites on our list run on subdomains of CivicPlus.
For example:
However, at least one site (and possibly others) appears to be accessible only via a non-CivicPlus domain, presumably one the government agency set up or manages itself.
Napa County is one known example:
This issue first cropped up in #63 and affects #80.
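The distinction above, between sites on a vendor subdomain and sites on an agency's own domain, could be checked with a simple hostname test. The sketch below is only an illustration: `on_vendor_domain` is a hypothetical helper, and the vendor suffix is passed in as a parameter rather than hard-coded, since the exact URL pattern is an assumption to confirm against the site list.

```python
from urllib.parse import urlsplit


def on_vendor_domain(url, vendor_suffix):
    """Return True if the URL's host is the vendor domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    suffix = vendor_suffix.lower().lstrip(".")
    # Require a dot boundary so "notvendor.example" does not match "vendor.example".
    return host == suffix or host.endswith("." + suffix)


# Placeholder hosts only; "vendor.example" stands in for the real vendor domain.
urls = [
    "https://ca-town.vendor.example/AgendaCenter",
    "https://www.town.example.org/AgendaCenter",
]
off_vendor = [u for u in urls if not on_vendor_domain(u, "vendor.example")]
```

Here `off_vendor` would hold only the second URL, which is the kind of site this issue is about.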