
Initial Work on my Paddington and Marmalade Sandwiches Proposal for the Oxford Food Symposium

Paddington Boxed Set Unbound

Last spring, I finished the BU Gastronomy program by creating a methodology to compare word frequency in cookbooks. This was a long process that included finding digital txt versions of the cookbooks and converting them into word frequency visualizations in my Tableau Public account. Because I did not realize at the time that the process itself would become my thesis, I did not adequately document it or the small but significant decisions I made along the way, and I had to reconstruct them later. Since graduation, I have learned the business and academic terms for several of the steps in my methodology, and I wish I had done several things differently early in the project. My current thinking has been guided by conferences I attended virtually in Summer 2021, The Data Sitters Club, and my current enrollment in a Data Analytics class at General Assembly (remote).

As with any methodology, the question I am now considering is: how can I improve my thesis methodology in a new project? My current project idea came from the Oxford Food Symposium's call for paper proposals on Portable Food.

While in the Gastronomy program, I worked on a project exploring food in Paddington Bear books, television shows, and movies. I vaguely recalled that portable marmalade sandwiches appeared throughout the books, so I decided to explore the foodstuff's frequency and significance, starting by scanning a boxed set of Paddington stories.

I am currently creating txt versions of the books in the boxed set for distant reading, aided by technology. Here is my workflow (subject to revision) for this current work stage from my methodology spreadsheet.

  1. I started with the Box Set as the Initial Paddington Canon

  2. Set Goal to Visualize frequency of portable food mentions using AntConc, Google Sheets, and Tableau Public

  3. Realized and accepted that I would need to destroy the physical books for this project

  4. Removed pages from binding

  5. Tried to scan pages using automatic feeder at 200 dpi

  6. Became frustrated that the small pages kept getting stuck in the scanner, and that I was scanning odd then even pages, which would have to be reassembled in Adobe Acrobat

  7. Spent some time scanning each page individually, which kept the pages in the correct order but took hours

  8. Placed the scanned physical pages back into their correct bindings for future questions around page/book order and rescanning as needed

  9. Placed scans into dedicated Dropbox folders for each day's scans

  10. Made an initial spreadsheet to list Paddington Stories by book, year, and illustrator

  11. Currently rearranging scanned pages in Adobe Acrobat, making each story its own binder

  12. Currently running Optical Character Recognition (OCR) on scanned pages

  13. Designed Spreadsheet with initial columns

  14. Getting ready for more Scanning at 600 dpi

  15. Planning to make a Dictionary Defining Columns for the main spreadsheet

  16. Need to create txt files from the scanned pages

  17. Making plans to clean txt files

  18. Plan to rescan Specific pages as needed

  19. Plan to import virtual books into Tropy with accurate metadata

  20. Plan to clean Data with documentation of cleaning practices

  21. Need to consider new copyright rulings that would have made this whole process easier
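
To make the planned cleaning and counting stages (steps 2, 17, and 20) more concrete, here is a minimal Python sketch of what that work could look like. It is not part of my actual workflow, which relies on AntConc, Google Sheets, and Tableau Public. The term list `FOOD_TERMS` and the function names `clean_text` and `count_terms` are hypothetical, chosen only for illustration, and the cleaning rules are assumptions about what OCR output might need.

```python
import re
from collections import Counter

# Hypothetical list of portable-food terms; a real list would come
# from the project spreadsheet, not from this sketch.
FOOD_TERMS = ["marmalade sandwich", "marmalade", "sandwich", "bun"]


def clean_text(raw):
    """Apply two simple, documentable cleaning rules to OCR text."""
    # Rule 1: rejoin words split by a hyphen at a line break.
    # Caveat: this also joins genuinely hyphenated words that happen
    # to break across lines, so the rule should be recorded and reviewed.
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", raw)
    # Rule 2: collapse all runs of whitespace into single spaces.
    text = re.sub(r"\s+", " ", text)
    return text.lower().strip()


def count_terms(text, terms):
    """Count whole-word matches of each term, allowing a plural ending."""
    counts = Counter()
    for term in terms:
        pattern = r"\b" + re.escape(term) + r"(?:es|s)?\b"
        counts[term] = len(re.findall(pattern, text))
    return counts


sample = (
    "Paddington took a marma-\nlade sandwich from his hat. "
    "He liked marmalade sandwiches and buns."
)
counts = count_terms(clean_text(sample), FOOD_TERMS)
for term, n in counts.items():
    print(term, n)
```

Note that the counts overlap: every "marmalade sandwich" is also counted under "marmalade" and under "sandwich". Whether to count phrases and single words separately is exactly the kind of small but significant decision worth recording in the cleaning documentation.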

