Tuesday, December 20, 2011

The Transparency Project in 18 Minutes

Donald Regalmuto has put together an edit of my presentation in Marin County in October 2010. (It even includes my first ever attempt at animation, done with the open source program Synfig Studio. Even though it's extremely basic and lasts only about 15 seconds, I'm really happy to have animated something.) The video is about 18 minutes and may be of interest and/or use. This was put together thanks to Lori Grace.

If the version below is clipped, you can see it directly at YouTube:


Sunday, November 20, 2011

Updated November 2011 Election Results

After manually reviewing every ballot image to find and correct errors in the TEVS results, I located 35 vote targets whose software interpretations needed correction. Mostly, these were due to extremely light check marks. The review of the roughly 12,000 images, with votes overlaid on them by the software, went at about 90 to 100 images per minute, or 90 to 100 ballots per minute, since these were single-sided ballots.

Having made those corrections, my independent results from the Transparency Project's images match the county's Final Unofficial Results exactly except for the following:

Dave Saunderson 1559 versus county report of 1557
Emil Fierabend 441 versus county report of 442
John W Corbett 1377 versus county report of 1376
Joe O'Hara 104 versus county report of 102
Judy Gower 162 versus county report of 164
Mike Seeber 55 versus county report of 54
Susan Johnson 3104 versus county report of 3103
Zachary Thoma 16 versus county report of 17

As my interpretation of votes may not have followed the official rules, it is very reasonable to assume that the errors are in the independent count. None of the discrepancies would change any results.

The only reason I am not calling this "entry" final is that we may have an opportunity to do multiple-person recounting of the races with discrepancies.
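The discrepancies listed above can be tabulated with a short script. This is only a sketch for checking the arithmetic: the numbers are copied from the list in this post, and the function name is made up for illustration, not part of TEVS.

```python
# Counts copied from the list above: (independent count, county count).
counts = {
    "Dave Saunderson": (1559, 1557),
    "Emil Fierabend": (441, 442),
    "John W Corbett": (1377, 1376),
    "Joe O'Hara": (104, 102),
    "Judy Gower": (162, 164),
    "Mike Seeber": (55, 54),
    "Susan Johnson": (3104, 3103),
    "Zachary Thoma": (16, 17),
}


def differences(counts):
    """Return {name: independent - county} for every listed choice."""
    return {name: ours - theirs for name, (ours, theirs) in counts.items()}


diffs = differences(counts)
largest = max(abs(d) for d in diffs.values())

for name, diff in diffs.items():
    print(f"{name:16} {diff:+d}")
print("largest discrepancy:", largest)
```

No choice differs from the county report by more than 2 votes.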

Thursday, November 17, 2011

Preliminary Results, Nov 2011 Elections

Given the dismal 22% turnout, there were only 12,269 ballots for the Transparency Project to scan. We began with 3.5 hours of scanning on Veterans Day, and then 3 shifts this week totaling 11 more hours.

I ran the scans through TEVS yesterday and have manually counted the (fewer than 10) ballot scans it rejected. I also checked for votes on about 40 ballot images when a vote region was indicated as just a smidgen above the darkness cutoff. The results are still missing for one ballot, where we mistakenly scanned the blank reverse. That ballot will have to be retrieved from the County's ballot storage.

The results compare well with the Final Unofficial Results posted at the County web site.

I'll point to a spreadsheet once I've been able to check the results more thoroughly, but here are screen shots of the spreadsheet in its current form. The write-in results are incomplete, due to my mistake, but the other results are mostly within 1 of the Final Unofficial Results, with a couple off by 2.

Note that some candidates have more than one line, where optical character recognition generated multiple versions of their name and I did not merge them (many such variants have already been merged in software -- the ones that show are the ones I missed).

Tuesday, November 1, 2011

If I can shop online, why can't I vote online?

An important explanation of a common question, by David Jefferson, at the Election Law Blog.


Thursday, October 20, 2011

Progress towards a graphical UI for TEVS

I'd hoped to have a tested graphical user interface version of TEVS ready before November, but that has proven impossible. As I won't be able to touch this for some time, I've posted the existing code in an alternate repository, tevs.gui, at tevs.googlecode.com. Though it works soup-to-nuts for scanning, processing, counting, and reporting at my setup, it is untested, undocumented, and incomplete.

Here are screenshots, from which behavior can be inferred. Clicking the images will get you larger versions.

There are three control panel tabs, corresponding to the stages of the process.

The first tab on the control panel is for scanning. The GUI calls out to a separate scanning process, which gets the scans and writes them to files. The GUI then reads the files. Controls allow for choosing an endorser if available, changing resolution, and selecting duplex/simplex and ballot size. If a scanner is not found (unplugged, bad connection), the user can check the connection and click "Search for scanner" to connect. If a scanner is present, its type is displayed. Multiple scan sources are not currently handled.

The second tab is for processing scanned ballots into a vote database, getting info from the database, and signing and writing all info to a DVD. The signing and writing are done by prompting the user through a terminal window dialog with tar, gpg, etc., and encouraging them NOT to do the signing on the actual machine running TEVS.
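As a rough illustration of the bundle-and-sign step, here is a minimal Python sketch. It is not the code in tevs.gui: the function names and file names are assumptions. It builds a tar archive, computes a SHA-256 digest that can be compared on another machine, and lists the standard gpg commands a person would then type by hand.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def bundle_and_digest(source_dir, archive_path):
    """Tar up source_dir and return the SHA-256 hex digest of the archive."""
    source_dir = Path(source_dir)
    with tarfile.open(str(archive_path), "w") as tar:
        tar.add(str(source_dir), arcname=source_dir.name)
    sha = hashlib.sha256()
    with open(str(archive_path), "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            sha.update(chunk)
    return sha.hexdigest()


def signing_commands(archive_name):
    """Commands to run by hand, ideally NOT on the machine running TEVS."""
    return [
        f"gpg --armor --detach-sign {archive_name}",          # writes <name>.asc
        f"gpg --verify {archive_name}.asc {archive_name}",    # checks the signature
    ]


# Demonstration on a throwaway directory (file contents are placeholders).
with tempfile.TemporaryDirectory() as tmp:
    results = Path(tmp) / "results"
    results.mkdir()
    (results / "votes.csv").write_text("count,choice\n140,YES\n9,NO\n")
    archive = Path(tmp) / "results.tar"
    digest = bundle_and_digest(results, archive)
    print(digest)
    for command in signing_commands(archive.name):
        print(command)
```

Keeping the gpg step as something a person types, rather than something the program runs silently, fits the goal of not asking the user to simply trust the tool.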

Decisions about how to insist on an educated user have been the most difficult part of this process -- it is pointless to produce a tool that is simply trusted. The disk's user is free to alter the database at will -- the assumption here is that the group using the tool is independent of the elections office and trusts itself.

Merging OCR variants -- templates are built automatically on supported vendor ballot designs when the system encounters new ballot types (as in new precincts, for example), but the OCR in building the templates will sometimes read the same contest or choice with different errors from one precinct to the next. The system makes its best guesses and then asks you to confirm or alter associations between different variants of contests and choices before it totals things up. To ignore this, all one needs to do is click "Done" and the system skips on to overvote processing and counting up the votes.
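To give a feel for the "best guess, then confirm" step, here is a minimal sketch using Python's standard difflib. It is not how TEVS does it; the function name and the 0.85 cutoff are assumptions, and the misread variants in the example are invented for illustration.

```python
from difflib import SequenceMatcher


def suggest_groups(names, cutoff=0.85):
    """Greedily group names whose similarity to a group's first member
    is at least `cutoff`. The groups are suggestions for a person to
    confirm or alter, not final merges."""
    groups = []
    for name in names:
        for group in groups:
            ratio = SequenceMatcher(None, name.upper(), group[0].upper()).ratio()
            if ratio >= cutoff:
                group.append(name)
                break
        else:
            # No existing group was close enough; start a new one.
            groups.append([name])
    return groups


# The "0" and "5" misreadings below are made-up OCR errors.
variants = [
    "MARILYN SANDERSON",
    "MARILYN SANDERS0N",
    "JAMES BARNES",
    "JAMES BARNE5",
    "RICK WALSH",
]

groups = suggest_groups(variants)
for group in groups:
    print(group[0], "<-", group[1:])
```

A cutoff that is too low would merge different candidates with similar names, which is one reason the associations are shown to the user before anything is totaled.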

Showing results -- they can also be put into a PDF and printed.

The third tab is for displaying results on ballots. You can walk through ballots sequentially. You can also query the database for ballots with particular characteristics (ambiguous votes, overvotes, particular precinct, etc.), and can click through the resulting list to see those ballots.
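The kind of query behind that list might look like the sketch below. Everything here is an assumption made for illustration: the table layout, the column names, the file names, and the use of an in-memory SQLite database as a stand-in for the real vote database.

```python
import sqlite3

# Only these (hypothetical) flag columns may be queried by name.
ALLOWED_FLAGS = {"ambiguous", "overvoted"}


def make_example_db():
    """Build a tiny in-memory database with an invented schema and rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE votes (filename TEXT, precinct TEXT, contest TEXT, "
        "choice TEXT, ambiguous INTEGER, overvoted INTEGER)"
    )
    rows = [
        ("scans/0001.jpg", "P1", "Director", "CHOICE A", 0, 0),
        ("scans/0002.jpg", "P1", "Director", "CHOICE B", 1, 0),
        ("scans/0003.jpg", "P2", "Director", "CHOICE A", 0, 1),
        ("scans/0003.jpg", "P2", "Director", "CHOICE B", 0, 1),
    ]
    conn.executemany("INSERT INTO votes VALUES (?, ?, ?, ?, ?, ?)", rows)
    return conn


def ballots_flagged(conn, flag, precinct=None):
    """Return the distinct ballot images that have at least one vote with `flag` set."""
    if flag not in ALLOWED_FLAGS:
        raise ValueError(f"unknown flag: {flag}")
    sql = f"SELECT DISTINCT filename FROM votes WHERE {flag} = 1"
    params = []
    if precinct is not None:
        sql += " AND precinct = ?"
        params.append(precinct)
    sql += " ORDER BY filename"
    return [row[0] for row in conn.execute(sql, params)]


conn = make_example_db()
print(ballots_flagged(conn, "overvoted"))
print(ballots_flagged(conn, "ambiguous", precinct="P1"))
```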

The user interface is done using Gtk, and its appearance is generated via Glade, a user interface design tool. This makes it easily alterable and translatable. Tasks like scanning, database access, and ballot processing are farmed out to other processes, generally slight modifications of the non-UI routines that already exist.

Friday, September 2, 2011

Scotia August 2011 Results

We've scanned the 151 ballots for Scotia's August 2011 election and counted them with TEVS. The results are below.

Interestingly, because one voter voted for three candidates and then wrote them in, marking three write-in boxes, that voter's votes in the Director contest were considered invalid -- only five boxes can be selected, and if more than five are selected, none are counted. That ballot, with the voter's handwriting removed, is the image you see above.
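The overvote rule described above fits in a few lines. This is a sketch of the rule as stated in this post, not the TEVS code; the function name is made up, and the example selections are the six rows listed for ballot 001088 in the overvote table further down.

```python
def countable_choices(selected, max_votes):
    """Return the choices that count in one contest on one ballot.

    If more boxes are marked than the contest allows, the contest is
    overvoted and none of its marks are counted.
    """
    if len(selected) > max_votes:
        return []
    return list(selected)


# Three named candidates plus three marked write-in boxes: six marks.
ballot_001088 = [
    "CAROLYN DEPUCCI",
    "JOHN BROADSTOCK",
    "MARILYN SANDERSON",
    "Writein",
    "Writein",
    "Writein",
]

print(countable_choices(ballot_001088, max_votes=5))       # overvoted
print(countable_choices(ballot_001088[:3], max_votes=5))   # within the limit
```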

Because this is such a small set of ballots, we've chosen to hide the handwriting of the write-in votes by modifying the few scans with write-ins to erase the handwriting and replace it with standard text. The scans are here (29 megabytes), the digital signature to validate the scans is here, a screenshot of TEVS showing the first ballot is here, and the detailed spreadsheets are here (as Open Office) and here (as Excel).

(Note: "ambiguous" in this context doesn't mean the voter was ambiguous, but that TEVS had contradictory results on the two different tests it uses to decide if a mark is enough to register as a vote. Such votes are hand-checked.)
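To illustrate how two tests can contradict each other, here is a minimal sketch. The two tests shown (average brightness of the vote target, and the fraction of clearly dark pixels) and all the cutoff values are assumptions chosen for the example; this post does not say which two tests TEVS actually uses.

```python
def classify_mark(pixels, mean_cutoff=200, dark_level=128, dark_fraction_cutoff=0.10):
    """Classify a vote target given as rows of grayscale values (0 = black, 255 = white).

    Hypothetical test 1: the average brightness is below mean_cutoff.
    Hypothetical test 2: enough pixels are darker than dark_level.
    When the two tests disagree, the mark is flagged for hand-checking.
    """
    flat = [p for row in pixels for p in row]
    mean_says_vote = sum(flat) / len(flat) < mean_cutoff
    dark_fraction = sum(1 for p in flat if p < dark_level) / len(flat)
    dark_says_vote = dark_fraction >= dark_fraction_cutoff

    if mean_says_vote and dark_says_vote:
        return "vote"
    if not mean_says_vote and not dark_says_vote:
        return "no vote"
    return "ambiguous"


blank = [[255] * 4 for _ in range(4)]
filled = [[0] * 4 for _ in range(4)]
# A thin, light check mark: two dark pixels on an otherwise white target.
light_check = [[255] * 4 for _ in range(4)]
light_check[1][1] = 0
light_check[2][2] = 0

for name, target in [("blank", blank), ("filled", filled), ("light check", light_check)]:
    print(name, "->", classify_mark(target))
```

In this toy example the light check mark passes the dark-pixel test but not the average-brightness test, so it is labeled ambiguous.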

count | choice_text
70-1 | RICK WALSH (one vote labeled ambiguous inspected and removed from ballot 77)
140 | YES
9 | NO
7 | Writein
(12 rows)

Not included above due to "overvote" in contest:

substring | choice_text
roc/001/001067.jpg | GAYLE MCKNIGHT
roc/001/001067.jpg | JAMES BARNES
roc/001/001067.jpg | JOHN CANESSA
roc/001/001067.jpg | MARILYN SANDERSON
roc/001/001067.jpg | WILLIAM BILL STEPHENS
roc/001/001067.jpg | RICK WALSH
roc/001/001087.jpg | GAYLE MCKNIGHT
roc/001/001087.jpg | JAMES BARNES
roc/001/001087.jpg | JOHN CANESSA
roc/001/001087.jpg | MARILYN SANDERSON
roc/001/001087.jpg | WILLIAM BILL STEPHENS
roc/001/001087.jpg | RICK WALSH
roc/001/001088.jpg | CAROLYN DEPUCCI
roc/001/001088.jpg | JOHN BROADSTOCK
roc/001/001088.jpg | MARILYN SANDERSON
roc/001/001088.jpg | Writein
roc/001/001088.jpg | Writein
roc/001/001088.jpg | Writein
roc/001/001026.jpg | JAMES BARNES
roc/001/001026.jpg | Writein
roc/001/001026.jpg | Writein
roc/001/001026.jpg | Writein
roc/001/001026.jpg | Writein
roc/001/001026.jpg | Writein
(24 rows)

Friday, July 15, 2011

A Voting System without "Spoilers" -- Approval Voting

A perpetual problem with our current voting system is that it subjects third-party candidates to charges that they are "spoilers" while forcing voters to vote strategically rather than honestly. That is, if you really like candidate C, but realize that the most likely winner is either candidate A or candidate B, you may feel obligated to vote for A to prevent a win by B.

How hard is it to get around this problem? Not hard at all -- it just means considering the system that was used to vote for the first four U.S. presidents. That's Approval Voting.

Under approval voting, you can vote for as few or as many choices as you like in each election. Then, every vote for every candidate is added up, and the candidate with the most votes (approvals) wins. Easy.
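The counting rule just described is simple enough to sketch in a few lines of Python. The ballots below are invented, with candidate names A, B, and C echoing the example above.

```python
from collections import Counter


def approval_tally(ballots):
    """Count approvals: each ballot is the set of candidates that voter approves of."""
    tally = Counter()
    for approved in ballots:
        tally.update(set(approved))  # set() so nobody is counted twice per ballot
    return tally


def winners(tally):
    """Return the candidate(s) with the most approvals (more than one if tied)."""
    top = max(tally.values())
    return sorted(name for name, count in tally.items() if count == top)


# Five made-up ballots. Voters who like C can also approve of A or B.
ballots = [
    {"A", "C"},
    {"C"},
    {"B"},
    {"A"},
    {"B", "C"},
]

tally = approval_tally(ballots)
print(dict(tally))
print("winner(s):", winners(tally))
```

Here A and B each get 2 approvals and C gets 3, so C wins, and no voter had to abandon C in order to support A or B.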

More here, at the Center for Election Science.

Tuesday, June 7, 2011

TEVS Live CD now online

One of the most frustrating things about the TEVS software has been its dependence on a modified version of an open source image utility library. It was an early mistake but, because it worked for me on my setup, it wasn't until very recently that I removed the dependency on this modified version. The dependency made it needlessly difficult for others to build TEVS. But it's gone now.

In addition, a lot of cleanup, documenting, and restructuring of the ballot handling code has been done by James Frasche, thanks to funding provided by Lori Grace and the Grace Institute for Democracy and Election Integrity. (Thanks!) This will make it easier for people to add handling for new ballot types going forward.

So it finally makes sense to make a Live CD available to those who might want to have an environment in which they can run the code, figure out what they want to add or improve, and just add and improve it.

The Live CD is a version of the Ubuntu Live CD which includes TEVS, all the programs it uses, a few sample ballot images, and a link to the code repository at tevs.googlecode.com. While it's NOT slick, it does provide a few buttons that allow you to run TEVS on a small set of provided ballot images and see the results, see the XML templates that get built, then run it again on the same images to re-use the templates, then run it on the images at a lower resolution, to see the difference in speed between 300 dpi images and 150 dpi images. Because the Live CD's 900 meg size won't fit on a CD, it needs to be burned onto a DVD. Once you've burned it to a DVD, though, you can just pop it into any PC or modern Mac and run Ubuntu and TEVS without installing them on your machine.

If you've never used Ubuntu Linux, don't worry. Using what you need for the demo should be perfectly straightforward. You can also see how easy it is to use Ubuntu for things like web browsing and general office work.

The Live CD should be a good tool for demonstrating the possibilities TEVS provides. It can be downloaded from this link: http://dl.dropbox.com/u/18212385/customdist.iso

Monday, March 21, 2011

TEVS Source Code Update Online

A Mercurial repository of the TEVS source code is now online at tevs.googlecode.com.

The entry point for the data extraction code is at tevs/main.py, and the generic ballot handling code is at tevs/Ballot.py. The display code, in tevs/TEVS.py, will be undergoing major modifications.

The reason for putting the code online at this point is not that it's ready for prime time, but that the data extraction part has now been restructured to the point that it's no longer mostly a waste of time to use it as a starting point.