PoliticalMashup Transformers

Version e7ce242b409f2680ba5b97c2a07b7a51f3c62cba Last updated August 09 2016 14:40:57.


TWIKI: http://ilps-twiki.science.uva.nl/twiki/bin/view/Main/WebHome?topic=ParliamentTransformers
GIT: get your local copy for editing:
$ git clone git@bitbucket.org:ilps/pm-transformer.git

EXIST LOGS: http://monitor.politicalmashup.nl/monitor/logs

TODO: make this README a readable format, as it is included in transformer.politicalmashup.nl/index.php

PULL INTO transformer.politicalmashup.nl
ssh mashup2.science.uva.nl
sudo su ilps_bg
/scratch/tools/git/pull.sh transformer

All info in 

Remember to clone DutchParlSchema if you need it:
$ git clone git@github.science.uva.nl:politicalmashup/schema.git 
(an updated clone lives on http://schema.politicalmashup.nl/)

TODO: most of the information below is out of date. Update the twiki page, and fill in this README (for people who cannot read the twiki)

Addendum to the note below: the character is U+00A0 (non-breaking space, &#160;). It looks like whitespace, but normalize-space() does not clean it up, since that function only handles space, tab, CR and LF.
There is a strange "invisible" character that crops up every now and then in the Swedish data (perhaps the Norwegian data too). normalize-space() does not remove it,
but replace(.,'&#160;','') does (for HTML display, the character is &nbsp;).
A replace() with the literal character between the quotes should work too, but the character looks like an ordinary space and hexdump gets confused by it. No clue why, but the non-breaking space is probably worth removing everywhere.
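A minimal sketch of how the non-breaking space could be stripped in one place, assuming an XSLT 2.0 processor (replace() is an XPath 2.0 function). This is a standalone illustration, not one of the stylesheets in this repository; whether to delete the character or turn it into an ordinary space is a choice left open here.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustration only: a standalone stylesheet, not part of common/ or live/. -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy every node and attribute that is not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Text nodes: delete every U+00A0 (non-breaking space).
       normalize-space() alone would leave these characters in place.
       On an XSLT 1.0 processor, translate(., '&#160;', '') has the same effect. -->
  <xsl:template match="text()">
    <xsl:value-of select="replace(., '&#160;', '')"/>
  </xsl:template>

</xsl:stylesheet>
```

Writing the character as &#160; instead of pasting it literally keeps it visible in the source, which avoids the confusion with an ordinary space described above.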

with set-xml-base.xsl on se/live/se-ge-2003.xsl

(re)move: folder nl/ with example documents

(re)move: earliest two se periods (1970-1990), and clean up any old se stuff

clean up: scripts (transform/validate etc.), move to a subfolder script, or delete altogether

move: nl, new and old (!) to its own folder
- nl-draft and nl-officielebekendmakingen-ge-2011.xsl are to use the new structure (common/live/ includes)
- nl-officielebekendmakingen.xsl (and its Preprocess?) are older and more or less standalone, and will probably remain as such for now.
- nl-sgd.* is trickier: it uses several pre/post processing steps and the old All2Pol.. includes. Long-term, this should also be rewritten to the new format.

- N.B. it is not urgent, but it is desirable to also convert the older nl-* xslt to the new format.
  This makes them easier to maintain when we want to re-run all data and include new structure (new meta-fields perhaps?).

(remove: make sure (after nl) they are not needed anymore. Then delete all All2Pol... in the root, remove CountrySpecificTemplate.xsl)

about nl order:


XML files

XSL files

Other files