Archive of Hong Kong Legco Votes #1

Open
siuying opened this issue Mar 9, 2013 · 10 comments

Comments

@siuying
Member

siuying commented Mar 9, 2013

any idea on how to do this?

@alanho
Member

alanho commented Mar 10, 2013

i'm afraid i overcomplicated this. the first question is how to convert the old records into machine-readable ones; that would take OCR or image processing. and for future votes, i'm in touch with some contacts who have access to the LegCo Secretariat to see if they can improve the system to store a machine-readable format instead of scanned PDFs

@sammyfung
Member

It is difficult to use OCR to turn those scanned records into machine-readable ones. For LegCo meeting minutes, I checked a few days ago and found that all scanned records (minutes only) from 1985 have been replaced; most of them are now 'machine readable' PDFs.

And Charles Mok said on Saturday at the HKLUG/ITFest seminar that the LegCo Secretariat is preparing machine-readable copies of voting results, but we don't know when they will be completed.

@siuying
Member Author

siuying commented Apr 22, 2013

good news! 👍

@alanho
Member

alanho commented Apr 22, 2013

any links to those machine readable PDFs?

it doesn't need to be full or perfect OCR, maybe a simple histogram will do the job? each vote is either yes, no or abstain.

but if somebody can confirm that LegCo will convert the old data to a machine-readable format, we could just wait for a bit. if not, maybe someone could get in touch with a computer vision lecturer/professor at a local university and set this task as an assignment for students ;D
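
a rough sketch of the histogram idea above, in plain Python. it assumes each member's row has already been cropped out of the scan, binarized (1 = dark pixel, 0 = white), and split into three equal-width cells in the order yes / no / abstain. that layout, the label order and the function names are all hypothetical, not taken from the actual LegCo scans.

```python
from typing import List, Sequence

# Hypothetical column order; the real scans may use a different layout.
LABELS = ("yes", "no", "abstain")


def column_histogram(row: List[List[int]]) -> List[int]:
    """Count dark pixels (1s) in each pixel column of a binarized row image."""
    width = len(row[0])
    return [sum(line[x] for line in row) for x in range(width)]


def classify_vote(row: List[List[int]],
                  labels: Sequence[str] = LABELS,
                  min_ink: int = 1) -> str:
    """Pick the cell with the most ink; return 'unknown' if blank or tied."""
    hist = column_histogram(row)
    cell_width = len(hist) // len(labels)
    # Total ink per cell = sum of the column histogram over that cell.
    ink = [sum(hist[i * cell_width:(i + 1) * cell_width])
           for i in range(len(labels))]
    best = max(range(len(labels)), key=lambda i: ink[i])
    if ink[best] < min_ink or ink.count(ink[best]) > 1:
        return "unknown"
    return labels[best]


# Toy 3x9 row: a small mark in the middle cell only.
sample_row = [
    [0, 0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 0],
]
print(classify_vote(sample_row))  # middle cell has the ink -> "no"
```

real scans would need deskewing, cropping and a noise threshold (`min_ink`) tuned on sample pages before something like this could be trusted.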

@siuying
Member Author

siuying commented Apr 22, 2013

@alanho http://www.legco.gov.hk/yr12-13/chinese/counmtg/motion/mot_1213.htm#toptbl it will need some smart algorithm to extract them into structured data, though.

@alanho
Member

alanho commented Apr 22, 2013

so far.. this is all i have time for.. a script to extract all the voting results into images or PDFs..

https://gist.github.com/alanho/5433521

@siuying
Member Author

siuying commented Apr 22, 2013

not very useful if we only get the images and still need manual intervention. Perhaps just write a scraper to extract data from the app: https://itunes.apple.com/hk/app/yi-yuan-biao-xian-lu/id549783193?mt=8 where they input the data manually

@alanho
Member

alanho commented Apr 22, 2013

interesting app. didn't know about it! but i guess they don't have all the voting results, do they??

@alanho
Member

alanho commented Apr 22, 2013

the target is to build a database that these guys can use, so they don't have to worry about data input and can just focus on presenting the data in their own ways

@siuying
Member Author

siuying commented Apr 22, 2013

I think they have all the voting results for the period they cover, but that's only 2008-2012
