Boston University Gastronomy Program

May 5, 2021

Presentation Links

Slide 11: Animal Protein Example Visualization Here

Slide 17: Current Cookbook Progress Visualization Here

Slide 18: Visualization Filtered Fruit Comparison Here

Slide 19: Visualization Selected “Othering” Food Descriptions Here

Slide 22: Cooking Fats Visualization Here

Slide 23: Visualization Maria Parloa Mentions Map Here

Additional Visualizations Here

Additional Resources Here

Presentation Here

Thank you List Here

Works Cited Here

Copy of Google Sheet (Navigate to ASFS-Table 1 Tab) Here


Work Flow

1.     Select possible cookbooks available digitally that you want to further explore. Ideally, you will already be familiar with the text and the cuisine.

2.     If pre-cleaned text versions of the cookbooks are unavailable, consider the length and use of non-standardized fonts in the cookbooks to determine the amount of time you are willing to spend cleaning up the text file.

3.     Organize any pre-existing notes or reference photographs you have related to the relevant cookbook, possibly in a research management tool such as Tropy and/or a carefully labeled file.

4.     If needed, use an Optical Character Recognition tool to convert the PDF to a text document.

5.     Carefully save the text document, keeping the digital surrogate open for reference.

6.     Clean up the text document and save the document again. This can be a long process so take breaks as needed.

7.     Download AntConc or another textual analysis software.

8.     Open the text document in AntConc (or other application).

9.     Navigate to the Word Count Screen.

10.  Note any strings that seem strange and continue cleaning the text document.

11.  Resave the text document and reopen in AntConc, returning to clean the text document and resave as many times as needed.

12.  Create a Google Sheet with one column titled “Unique Term” and another titled “Frequency.”

13.  Copy and paste the lists from AntConc into the columns of the Google Sheet.

14.  Carefully name the Google Sheet.

15.  As you work with the Google Sheet, document any major changes made in an external document.  

16.  Brainstorm potential column headings for the Google Sheets based on personal research interests.

17.  Create the Columns, carefully checking for spelling mistakes.

18.  Repeat steps 3-12, and 14 with additional cookbooks, adding columns to the Google Sheet as needed.

19.  Combine Terms in the Google Sheet as needed, referring to the AntConc Concordance page and the Digital Surrogate, remembering to adjust the relevant frequency counts. This creates “unique terms.”

20.  Document when terms are combined into “unique terms.”

21.  Decide and document decisions around unstandardized food terms.

22.  Start Labeling “Stop Words”, that is, words you do not plan on using in analysis.

23.  Develop controlled vocabularies in labeled columns as needed.

24.  Add labeled columns as needed. In some cases, you may wish to only allow for a “Yes” entry into the column.

25.  Create a Tableau Public account, and download the application.

26.  Explore Public Vizs at https://public.tableau.com/en-us/gallery/?tab=viz-of-the-day&type=viz-of-the-day/.

27.  Import the Google Sheet into Tableau Public.

28.  Select the Sheet (ideally there will be only one option) to use in the Application.

29.  Navigate to a Sheet, and drag Column Headings to the Rows and Columns sections as desired.

30.  Explore the various types of charts and drag and drop filtration tools.

31.  Explore the various optional details and labels for your data.

32.  If needed, return to the Google Sheet and make changes to the spreadsheet, then return to Tableau Public and refresh your data under the Data menu.

33.  Create at least two visualizations.

34.  Navigate to a dashboard at the bottom of the menu.

35.  Combine your visualizations into a single Dashboard using the drag and drop tools.

36.  Optional: Explore the various formatting options.

37.  Save the visualization to Tableau Public, making sure that the box allowing the visualization to be updated from the Google Sheet is checked. It generally updates automatically about once per day.

38.  Add hashtags to your visualization as needed.

39.  If changes to the visualization are needed, open the application and open the Dashboard from Tableau Public.

40.  Share your visualization with others.

41.  Work with additional sources, or return to the AntConc tools, digital surrogate, or research photographs, to answer questions raised by the visualization.

42.  Develop additional visualizations to follow up on questions raised by initial visualizations.

43.  Download static versions of the visualizations as needed. These can be difficult to properly format to work in Word-type documents.
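Steps 9 through 13 build a two-column word list in AntConc and paste it into a Google Sheet. For readers who want to check a word list outside AntConc, the following is a minimal Python sketch of the same idea, not AntConc's own method. The tokenizing rule (lowercase runs of letters, with internal apostrophes allowed), the function names, and the sample sentence are assumptions made for illustration; AntConc's token definition is configurable and may produce different counts.

```python
import csv
import io
import re
from collections import Counter


def word_frequencies(text):
    """Count lowercase word tokens in a cleaned cookbook text.

    Assumption: a "word" is a run of letters, optionally with internal
    apostrophes. AntConc's own token definition may differ.
    """
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    return Counter(tokens)


def to_csv(frequencies):
    """Return CSV text using the two column headings from step 12."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Unique Term", "Frequency"])
    # Sort by descending frequency, then alphabetically for stable output.
    for term, count in sorted(frequencies.items(), key=lambda kv: (-kv[1], kv[0])):
        writer.writerow([term, count])
    return buffer.getvalue()


# Hypothetical sample text, not taken from any of the cookbooks.
sample = "Peel the oranges. Slice one orange and butter the pan; add butter."
counts = word_frequencies(sample)
csv_text = to_csv(counts)
print(csv_text)
```

The resulting CSV text can be saved to a file and imported into a Google Sheet in place of the copy-and-paste in step 13. Note that "orange" and "oranges" are still counted separately at this stage; combining them is the work of step 19.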


Glossary

This Glossary is provided to guide the reader through possibly unfamiliar terms in the thesis text. This glossary does not claim to provide a nuanced exploration of any of these terms but provides an explanation of how these terms are used in this work. In most cases, I am paraphrasing much longer definitions which can be found on the Works Cited page.

Controlled Vocabulary: A preselected list of terms used when entering data into databases to assist in making the information accessible (Society of American Archivists 2021).

Corpus Analysis: A type of textual analysis that allows comparisons between textual objects at a large scale. This thesis uses AntConc as the Corpus Analysis tool (Froehlich 2015).

Data Analysis: The labor of using statistical tools and knowledge to interpret data sets (Burnham 2020).  

Data Science: The labor of designing processes for managing and making accessible large sets of data for future analysis (Burnham 2020).

Data Visualization: Charts, graphs, and other infographics created to view patterns, outliers, and/or trends in large sets of data. They can be combined to form Dashboards that consist of visualizations arranged to be used in conjunction with each other and often include additional text and/or filtration tools to assist in the relevant analysis. This thesis uses Tableau Public as the tool to create visualizations and dashboards (Kriebel and Murray 2018, 30).

Digital Surrogate: A digitized version of a physical source that is stored in a specific physical location (SAA-ACRL/RBMS Joint Task Force on the Development of Guidelines for Primary Source Literacy 2018, 8).

Optical Character Recognition (OCR): Transforming digital images of characters to a machine-readable format (Society of American Archivists 2021).

Primary Source Literacy: The ability to find, interpret, evaluate, and use a primary source (Society of American Archivists 2021).

Schema: A formalized description of a data structure, frequently used to define a database in terms of tables and columns (Society of American Archivists 2021).

String: A sequential set of data treated as text by computer programs, therefore any numbers or special characters are only read as text characters. For example, when placed in a string, the number 8 does not hold any numerical value (WebsiteBuilders.com 2019).
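The difference between the character 8 and the number 8 can be seen in a short Python sketch. The variable names are hypothetical and chosen only for illustration.

```python
as_text = "8"    # the character 8 stored as a string
as_number = 8    # the integer 8

# Joining two strings places their characters side by side.
joined = as_text + as_text

# Adding two integers performs arithmetic.
summed = as_number + as_number

# A string of digits must be converted before it can be used in arithmetic.
converted = int(as_text) + as_number

print(joined, summed, converted)
```

Here `joined` is the two-character string "88", while `summed` and `converted` are both the number 16.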

Structured Data: Data that has been processed and manipulated to be stored in a managed relational database. This thesis uses spreadsheets as the relational database (Pierson 2017, 8).

Unique Term(s): A word or compound word used in the spreadsheet that has a precise meaning and may have been combined from multiple terms used in the cookbooks. This project combined terms with similar meanings, such as “orange” and “oranges,” to create unduplicated expressions of specific foodstuffs, cooking technologies, or specific concepts (e.g., “nurse” and “nursing” when used as verbs). This allows for the comparison of similar terms in visualizations. I adapted the term from the content management system Drupal, used to manage Web sites (Drupal 2021).
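Combining terms and adjusting their frequency counts (workflow step 19) can be sketched in Python as follows. This is an illustration under stated assumptions, not the process used in the project, where terms were combined by hand in the Google Sheet. The counts, the variant-to-term mapping, and the stop word list are all hypothetical. The sketch also drops stop words outright, whereas the workflow only labels them.

```python
from collections import Counter

# Hypothetical raw counts, as they might be pasted from a word list.
raw_counts = {"orange": 12, "oranges": 30, "nurse": 4, "nursing": 3, "the": 950}

# Hypothetical, hand-made mapping from each variant to its unique term.
# The researcher decides these pairings after checking each word in context.
unique_term_for = {"oranges": "orange", "nursing": "nurse"}

# Hypothetical stop words: words not planned for use in analysis.
stop_words = {"the"}


def combine_terms(counts, mapping, stops):
    """Merge variants into unique terms and add their frequencies together."""
    combined = Counter()
    for term, frequency in counts.items():
        if term in stops:
            continue  # simplification: the workflow labels these instead
        combined[mapping.get(term, term)] += frequency
    return dict(combined)


unique_counts = combine_terms(raw_counts, unique_term_for, stop_words)
print(unique_counts)
```

In this sketch "orange" ends with a frequency of 42 and "nurse" with 7. Keeping the mapping in one place also serves as a record of which terms were combined, which is the documentation that step 20 asks for.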