I extracted structured data from the PDF and built a searchable interface: https://open-register-of-pecuniary-interests.joshmcarthur.co....
You can search across all MPs' disclosed interests by name, company, or interest type. For example, you can quickly find which MPs have interests in specific sectors or companies, or filter by category or political party.
The data extraction was interesting. I found that a two-pass approach worked well with Gemini 2.5 Flash: the first pass pulled out MP names and the page numbers each one referenced, and the second pass took only the pages each MP appeared on and extracted structured data from just those pages.
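For anyone curious what the two passes look like in outline, here is a minimal sketch in Python. It is not the project's code (that is Ruby). It assumes page text has already been pulled out of the PDF into a list of strings, and `ask_model` is a hypothetical stand-in for whatever wraps the LLM call and returns parsed JSON; the prompts are placeholders too.

```python
from typing import Callable

# Hypothetical model interface: takes a prompt string, returns parsed JSON.
# In practice this would wrap a call to an LLM such as Gemini 2.5 Flash.
Model = Callable[[str], object]


def first_pass(pages: list[str], ask_model: Model) -> dict[str, list[int]]:
    """Pass 1: map each MP name to the 1-based page numbers they appear on."""
    body = "\n".join(f"[page {n}]\n{text}" for n, text in enumerate(pages, start=1))
    prompt = "List each MP and the page numbers they appear on.\n\n" + body
    return ask_model(prompt)


def second_pass(
    pages: list[str], index: dict[str, list[int]], ask_model: Model
) -> dict[str, object]:
    """Pass 2: for each MP, send only their pages and extract structured data."""
    results = {}
    for name, page_numbers in index.items():
        # Only the pages this MP appears on go into the prompt.
        subset = "\n".join(pages[n - 1] for n in page_numbers)
        prompt = f"Extract the interests declared by {name}.\n\n{subset}"
        results[name] = ask_model(prompt)
    return results
```

The point of the split is that the second pass only ever sees a small slice of the document per MP, so each extraction prompt stays short and focused.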
The approach could work for similar transparency registers in other countries - most seem to publish their data as PDF, which technically ticks the open-data box but isn't the most accessible format to work with. Even within NZ, I'm planning to expand the data I process to previous years, as well as processing data for local and regional councils (which have the same legal requirement to publish financial interests of council members).
Open sourced at https://github.com/joshmcarthur/open-register-of-pecuniary-i....
Tech stack: Ruby on Rails, SQLite (FTS5), Tailwind/DaisyUI - keeping it lightweight since this is just a side project to make public data more accessible.
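To give a feel for the FTS5 side, here is a rough sketch using Python's built-in sqlite3 module rather than Rails. The table name, columns, and sample rows are all invented for illustration and are not the project's actual schema or data; it also assumes your SQLite build has FTS5 enabled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per disclosed interest.
# party is UNINDEXED so it is stored but left out of full-text matching.
conn.execute(
    "CREATE VIRTUAL TABLE interests "
    "USING fts5(mp_name, party UNINDEXED, category, description)"
)

# Invented sample rows, for illustration only.
conn.executemany(
    "INSERT INTO interests VALUES (?, ?, ?, ?)",
    [
        ("Jane Doe", "Party A", "Company directorships", "Director of Acme Dairy Ltd"),
        ("John Roe", "Party B", "Real property", "Family home, Wellington"),
        ("Mary Major", "Party B", "Other companies", "Shareholder in Southern Dairy Holdings"),
    ],
)


def search(conn, query, party=None):
    """Full-text search across the indexed columns, optionally filtered by party."""
    sql = "SELECT mp_name, category, description FROM interests WHERE interests MATCH ?"
    params = [query]
    if party is not None:
        sql += " AND party = ?"
        params.append(party)
    sql += " ORDER BY rank"  # FTS5's built-in relevance ordering
    return conn.execute(sql, params).fetchall()
```

Calling `search(conn, "dairy")` matches on any indexed column, and `search(conn, "dairy", party="Party B")` narrows that to one party. Raw user input would need escaping first, since FTS5 treats some characters as query syntax.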