The Five Filters
Descripción del proyecto / Description of the project
The Interactivos 09 workshop has ended but The Five Filters installation will be available to see and use until 22 March 2009 at Medialab Prado.
Edward Herman and Noam Chomsky described five 'filters' in their propaganda model to explain the output of the mass media: ownership, advertising, news sourcing, flak and anti-communism (recently anti-terrorism). The goal of this project is to explore these filters using web searches and physical newspaper clippings.
One idea, for the initial phase of the project, is a system made up of a camera/webcam taking still images of a table. Newspaper clippings are placed on the table. The images are processed and the text extracted using something like GOCR/Tesseract (open-source character recognition). The text is then used to identify the source of the entry and related information using a variety of web searches/APIs (e.g. Google, Nexis, etc). At this point, there are many options open to us and I'd like to discuss these and hear suggestions about how we can proceed.
Another idea is to use the text to find and retrieve related stories from a set of alternative news sources.
We are also thinking about printing the resulting news stories in newspaper format.
Areas to think about
- How should we present the information we retrieve?
- How should users interact with this?
- How to display the process (happening in code) to the user waiting for results?
Work to be done (in any order)
- Project poster (I'll try and find info about this)
- Capturing images with webcam and Processing
- Extract text using OCR
- Fix hyphenated words at line ending ("If the final word in a line ends with a hyphen, rejoin it with its remainder")
- Term extraction (web service might be quickest way to get started here - not sure, see below)
- Anyone with access to Nexis database? (I had access in the UK, but my Swedish university account does not offer access to this fulltext news database)
- Compile list of alternative news sources
- Test to see if searching (e.g. in Google) for a sentence taken from a news story results in a URL for that story online
- Lots more to be added...
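The hyphen-fixing step in the list above could be sketched roughly as follows. This is only an illustration, not the project's code: the class name Dehyphenator and the method rejoin are made up, and the sketch assumes the OCR output keeps the original line breaks.

```java
import java.util.regex.Pattern;

/** Hypothetical helper: rejoins words that OCR output splits across lines with a hyphen. */
final class Dehyphenator {
    // A letter, a hyphen, optional trailing spaces, a line break,
    // optional leading spaces, then a lowercase letter.
    private static final Pattern SPLIT_WORD =
            Pattern.compile("(\\p{L})-[ \\t]*\\r?\\n[ \\t]*(\\p{Ll})");

    /** Removes the hyphen and the line break between the two halves of a split word. */
    static String rejoin(String ocrText) {
        return SPLIT_WORD.matcher(ocrText).replaceAll("$1$2");
    }

    public static void main(String[] args) {
        System.out.println(rejoin("The goal is to ex-\nplore these filters."));
    }
}
```

One known weakness: a genuinely hyphenated compound that happens to break at the hyphen (e.g. "anti-" at the end of a line, "communism" on the next) loses its hyphen too. A dictionary check would be needed to tell the two cases apart.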
Documentación (gráficos, fotos y vídeos) / Documentation (graphics, pictures and videos)
Day 1 (2009-01-30)
- Discussed project, how it would work, work to be done, etc...
- Produced sketch of the process (newspaper clipping as input, printed newspaper with alternative sources as one possible output)
- Created simple PHP script to get related stories from alt. news sites (uses Yahoo's term extraction and Google's Ajax search API)
Day 2 (2009-01-31)
- Webcam does not produce high-resolution results when used in Processing (tried both JMyron and the standard cam library)
- Logitech Quickcam does produce good enough results using the bundled software
- Set up Ubuntu system to see if command-line tools can be used to automatically capture high-resolution frames from the webcam
- Raul trying to run Perl term extraction software
- Alon tried RSS to newspaper services with good results (now trying to find open source tool we can use ourselves)
- Fernanda and Alon working on video, poster, ...
- Discussed project with Steve and Steve from useful sites suggested and
Day 3 (2009-02-01)
- Alejandro suggested (1) using physical objects for the filters and (2) using newspaper logos on cards with a QR code to identify the newspaper. That might be something to consider if the OCR is not good enough. reacTIVision markers might also be useful, as they would tell us the position of the card too.
- It's Sunday so we're taking a break.
Day 4 (2009-02-02)
- To get around the webcam problem (low-res when using video libraries within Processing), I tried booruWebCam. The old v1 works well and can capture a frame to file every second. The file can then be read inside Processing and processed using JMyron (using the hijack function).
- The idea is now to detect new frames, convert the JPG to TIFF (if we end up using Tesseract), and run Tesseract/OCRopus to process the text.
Day 5 (2009-02-03)
- Problem with reading the JPG within Processing: the file is being constantly written to, so reads do not always complete. Catching the exception doesn't work either. My attempts to lock the file using Java file locking also failed (even a read lock would crash the capturing software). Pix suggested a script that moves the JPG to a different file once writing has finished; as the move is an atomic operation, no incomplete data will be read in.
- Looking into triggering the capture software by code - that should solve the problem. Pix looking into capture programs in Linux.
- Fernanda and Alon working on video - currently cutting up newspapers.
- Raul finished a diagram showing the process from newspaper input to screen output. Also suggested using a wiimote as a way to let users interact with system.
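The move-when-finished idea Pix suggested might look like this in Java. It is a sketch under assumptions rather than what we actually ran: the class and method names are hypothetical, it uses the java.nio.file API from later Java versions, and it assumes both files sit on the same filesystem so that the rename can be atomic.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: hands a fully written frame over to the reader in one step. */
final class FrameHandover {
    /**
     * Moves the finished capture to the path the reader polls.
     * With ATOMIC_MOVE the reader sees either the old file or the complete new one.
     * Whether an existing target is replaced is implementation specific.
     */
    static void handOver(Path finishedFrame, Path readableFrame) throws IOException {
        Files.move(finishedFrame, readableFrame, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("frames");
        Path captured = dir.resolve("capture.jpg");   // written by the capture step
        Path readable = dir.resolve("frame.jpg");     // read by the Processing side
        Files.write(captured, new byte[] {1, 2, 3});  // stand-in for real JPEG data
        handOver(captured, readable);
        System.out.println("readable exists: " + Files.exists(readable));
    }
}
```

The sketch does not solve the question of knowing when the capture software has finished writing; that still needs a trigger of some kind.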
Day 6 (2009-02-04)
- Decided to try and show the process of finding alternative stories through Processing - possibly using Traer's particle system.
- Webcam, Processing and libraries now working in Ubuntu: using JMyron (Linux version), uvccapture (for capturing frames to file) and uvcdynctrl (for setting webcam focus, contrast, brightness, etc. -- have to load the logitech.xml file before setting focus)
- OCRopus a nightmare to install on Ubuntu 8.10 (Pix will attempt later) -- relying on Tesseract for now
Day 7 (2009-02-05)
- OCRopus now running on Ubuntu (yay!)
- Raul has camera configuration commands running from Java/Processing
Day 8 (2009-02-06)
- Need to detect motion to know when to capture frame (suggestions so far have been OpenCV as alternative to JMyron, and using the difference call in JMyron to check for black pixels -- black pixels indicate current and previous frame are identical)
- Need to test HTTP within Java - to send POST request and receive JSON response
- Alon and Fernanda finalising video
- Helena to join us tomorrow as a collaborator
- Raul has Processing running within Eclipse IDE and has classes that can do the capturing
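For the HTTP test mentioned above (send a POST request, receive a JSON response), a minimal sketch could look like the following. It uses the HTTP client built into Java 11 and later, not the Apache HttpClient we ended up with, and the class name, the form field name and the endpoint are placeholders supplied by the caller. The JSON comes back as a plain string; parsing it is left out.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

/** Hypothetical client: posts one form field and returns the raw JSON reply. */
final class JsonPostClient {
    /** Encodes a single name=value pair as application/x-www-form-urlencoded. */
    static String formBody(String field, String value) {
        return URLEncoder.encode(field, StandardCharsets.UTF_8) + "="
                + URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    /** Sends the POST and returns the response body, or throws on a non-200 status. */
    static String postForJson(URI endpoint, String field, String text)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(endpoint)
                .header("Content-Type", "application/x-www-form-urlencoded")
                .header("Accept", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(formBody(field, text)))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }
}
```

Since send() blocks until the reply arrives, it would need to be called from a separate thread to keep the sketch window responsive.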
Day 9 (2009-02-07)
- Motion detection now works correctly using JMyron's average() function. average() returns the average colour of a given area taken from the difference image. Passing this value to brightness gives us an average brightness value we can use to check if the difference image (the difference between the current and previous frame) has any movement.
- Added QR Code checks to check if newspaper has been placed or not.
- Helena helping Fernanda with image
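The motion check described above can be illustrated without JMyron. The sketch below is an approximation in plain Java, not our Processing code: it builds the difference between two frames itself, averages each colour channel, and takes the largest averaged channel as the brightness (as in the HSB model). The class name, method names and the threshold parameter are all hypothetical.

```java
import java.awt.image.BufferedImage;

/** Hypothetical motion check based on the average brightness of a difference image. */
final class MotionCheck {
    /** Brightness (0-255) of the average colour of |previous - current|. */
    static double differenceBrightness(BufferedImage previous, BufferedImage current) {
        int w = current.getWidth();
        int h = current.getHeight();
        if (previous.getWidth() != w || previous.getHeight() != h) {
            throw new IllegalArgumentException("Frames must have the same size");
        }
        long sumR = 0, sumG = 0, sumB = 0;
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int a = previous.getRGB(x, y);
                int b = current.getRGB(x, y);
                sumR += Math.abs(((a >> 16) & 0xFF) - ((b >> 16) & 0xFF));
                sumG += Math.abs(((a >> 8) & 0xFF) - ((b >> 8) & 0xFF));
                sumB += Math.abs((a & 0xFF) - (b & 0xFF));
            }
        }
        double pixels = (double) w * h;
        // Average colour of the difference image, then its largest channel.
        return Math.max(sumR / pixels, Math.max(sumG / pixels, sumB / pixels));
    }

    /** True when the frames differ by more than the given brightness threshold. */
    static boolean hasMotion(BufferedImage previous, BufferedImage current, double threshold) {
        return differenceBrightness(previous, current) > threshold;
    }
}
```

Identical frames give a brightness of 0, so a small non-zero threshold helps to ignore camera noise; the right value would have to be found by trial on the actual webcam.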
Day 10 (2009-02-08)
- Moved everything from Processing's environment to Eclipse IDE and modified code to pass PApplet instance to the other objects
- Using Apache's HttpClient to execute HTTP requests using Java threads
Day 11 (2009-02-09)
- Moved Eclipse project over to GNU/Linux machine and combined code
- Application running very slowly -- we had incorrect NVIDIA drivers for Ubuntu 8.10
- Raul and a friend installed the correct NVIDIA drivers after much hassle -- how?
- JMyron for Linux didn't work inside Eclipse until I moved the 2 required files to the correct Java folders (Processing bundles its own Java in the processing/java/ folder, but Eclipse runs a Java installed somewhere else, so the files need to be copied there)
- Fernanda finished images for video, but we need to send it to Alon now
- Fernanda working on poster
Day 12 (2009-02-10)
- We need to finish...
- Modify PHP to select only URLs of full stories (not listings)
- Create PHP to extract full story text given a full story URL
- Use RSS to newspaper service to print PDF
Day 13 (2009-02-11)
- Jorge rotated webcam image in Processing using pixel-math magic
- Fernanda working on poster
- Helena helped with poster translation and compiled propaganda model summary in Spanish (we need to print out copies for people)
- Raul tested Tabbloid and it works well. It's not free software, but it will do - and they have a developer API
- Alon having problems downloading image for video (very large file)
- uvcdynctrl command sometimes complains when importing logitech.xml - need to be root to get around the problem
- Still to do...
- Show details of news stories (title, description, full story) - look at JDIC
- Produce PDF to print
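The log above does not record how Jorge did the rotation, so the following is only a guess at the kind of pixel math involved. It assumes a quarter turn clockwise and a row-major pixel array like the one Processing exposes; the class and method names are made up.

```java
/** Hypothetical helper: rotates a row-major pixel array by a quarter turn. */
final class PixelRotate {
    /**
     * Rotates a w-by-h image 90 degrees clockwise.
     * The result is h pixels wide and w pixels tall.
     */
    static int[] rotate90Clockwise(int[] pixels, int w, int h) {
        if (pixels.length != w * h) {
            throw new IllegalArgumentException("pixels.length must equal w * h");
        }
        int[] out = new int[pixels.length];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int newX = h - 1 - y;  // old rows become columns, counted from the right
                int newY = x;          // old columns become rows
                out[newY * h + newX] = pixels[y * w + x];
            }
        }
        return out;
    }
}
```

For a 3x2 image with rows {1, 2, 3} and {4, 5, 6}, the result is a 2x3 image with rows {4, 1}, {5, 2}, {6, 3}.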
Day 14 (2009-02-12)
- Poster complete and printed
- Alon sent us very cool video
- JDIC didn't work out very well (couldn't use Mozilla, only IE, so not sure if it'd work on the Ubuntu machine) - will try to implement it later
- PDF is now being generated by Tabbloid and printed automatically. To make it work we create an RSS file from the selected stories, point the Tabbloid API at the RSS file, generate the PDF, wget the PDF, convert it to PostScript (.ps) and send that to the printer. So far, it's working well.
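The first step of that chain, creating an RSS file from the selected stories, could be sketched as below. The Story class and its three fields are assumptions made for the example, and the feed is a bare RSS 2.0 skeleton. The Tabbloid call, the download and the printing are not shown.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical writer: turns the selected stories into a minimal RSS 2.0 document. */
final class RssWriter {
    /** Hypothetical story record holding only what the feed needs. */
    static final class Story {
        final String title;
        final String link;
        final String description;

        Story(String title, String link, String description) {
            this.title = title;
            this.link = link;
            this.description = description;
        }
    }

    /** Escapes the characters that are not allowed in XML text content. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    private static void element(StringBuilder sb, String name, String value) {
        sb.append('<').append(name).append('>').append(escape(value))
          .append("</").append(name).append(">\n");
    }

    static String toRss(String feedTitle, String feedLink, String feedDescription,
                        List<Story> stories) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sb.append("<rss version=\"2.0\">\n<channel>\n");
        element(sb, "title", feedTitle);
        element(sb, "link", feedLink);
        element(sb, "description", feedDescription);
        for (Story story : stories) {
            sb.append("<item>\n");
            element(sb, "title", story.title);
            element(sb, "link", story.link);
            element(sb, "description", story.description);
            sb.append("</item>\n");
        }
        sb.append("</channel>\n</rss>\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Story> stories = Arrays.asList(
                new Story("Example story", "http://example.org/story", "Short summary"));
        System.out.print(toRss("Selected stories", "http://example.org/",
                "Placeholder feed", stories));
    }
}
```

Escaping matters here because story titles and descriptions pulled from news sites often contain ampersands and quotes, which would otherwise produce an invalid feed.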
Tecnologías y herramientas / Technologies and tools
This is a list of possible technologies that we will look into.
- Yahoo's term extraction web service (I don't want to rely too much on web services that we can't control, but this might be useful for quick prototyping)
- RiTa text library (thanks Georg)
GNU/Linux commands for capturing frames and OCRing images
uvcdynctrl -i logitech.xml (imports logitech config to allow focussing to be set)
uvcdynctrl -s Focus 145 (set focus to 145)
uvccapture -oframe.jpg -m -x640 -y480 -C150 -S1 -B150 (capture to frame.jpg at 640x480, -C contrast -S saturation -B brightness)
ocroscript deskew frame.jpg output.png (deskew and fix rotation of text, saves to output.png)
ocroscript recognize --output-mode=text output.png > text.txt (saves text in output.png to text file)
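Running the camera commands above from Java/Processing might be done along these lines. This is a sketch, not Raul's actual classes: the class and method names are hypothetical, only the flags listed above are used, and it assumes uvcdynctrl and uvccapture are installed and on the PATH.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

/** Hypothetical wrapper around the webcam command-line tools listed above. */
final class CaptureCommands {
    /** Builds: uvcdynctrl -s Focus <focus> */
    static List<String> setFocus(int focus) {
        return Arrays.asList("uvcdynctrl", "-s", "Focus", String.valueOf(focus));
    }

    /** Builds: uvccapture -o<file> -m -x<w> -y<h> -C<contrast> -S<saturation> -B<brightness> */
    static List<String> capture(String file, int width, int height,
                                int contrast, int saturation, int brightness) {
        return Arrays.asList("uvccapture", "-o" + file, "-m",
                "-x" + width, "-y" + height,
                "-C" + contrast, "-S" + saturation, "-B" + brightness);
    }

    /** Runs the command, forwards its output to the console and returns the exit code. */
    static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        return process.waitFor();
    }

    public static void main(String[] args) {
        // Only prints the commands; call run(...) on a machine with the tools installed.
        System.out.println(String.join(" ", setFocus(145)));
        System.out.println(String.join(" ", capture("frame.jpg", 640, 480, 150, 1, 150)));
    }
}
```

Passing the arguments as a list avoids shell quoting problems, and checking the exit code from run() gives a simple way to notice when uvcdynctrl complains.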
Autor del proyecto / Project's Author
My name is Keyvan Minoukadeh and I am currently studying Interaction Design in Gothenburg, Sweden.
Colaboradores / Collaborators
- Fernanda Reis
- Alon Chitayat
- Raul Dominguez
- Jorge Dueñas Lerín
- Helena Piñán
Photo 1: Group shot (missing Alon): Raul, Jorge, Helena (holding her cute but shy son), Keyvan, Fernanda
Photo 2: Alon and Fernanda