how was the QUISP spreadsheet created and why is its data kinda shitty?


how was the QUISP spreadsheet created?

I built the QUISP spreadsheet by going to the digitized Phoenix archive on Triptych and entering keywords I thought would bring up queer and trans content (i.e. lesbian, gay, bisexual, trans, transgender, LGBT, LGBTQ, genderqueer, queer, SQU, GLU). I sorted the results by date, because I figured there would be a higher density of material in more recent issues, and because moving through the material by date (as opposed to by number of keyword hits, or at random, for example) would help me understand it narratively.

Then, moving backwards through time, I would search for where the keywords were picked up in each issue and decide whether each article warranted cataloging. If I decided it did, I would enter its metadata into the Google spreadsheet: article name, author name, issue date, tags, and description.
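To make the shape of each entry concrete, here's a rough sketch of one catalog row in Python. The field names are my own labels for illustration, not the spreadsheet's actual column headers, and the values are made up:

```python
# One hypothetical catalog entry; field names and values are
# illustrative, not QUISP's actual column headers or data.
entry = {
    "article_name": "Example article title",
    "author_name": "Example Author",
    "issue_date": "2011-04-07",              # date of the Phoenix issue
    "tags": ["example-tag", "another-tag"],  # from my running tag list
    "description": "Mostly quotes pulled from the article itself...",
}

# Flattened into a single spreadsheet row, with tags joined in one cell:
row = [
    entry["article_name"],
    entry["author_name"],
    entry["issue_date"],
    ", ".join(entry["tags"]),
    entry["description"],
]
```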

The tags and descriptions are both subjective categories. I kept a running list of the tags I'd come up with, and tried to link as many relevant articles to each other as possible. The descriptions are fairly lengthy, and are mostly made up of quotes from the articles themselves. I figured that robust descriptions would reduce the need for future researchers to expend unnecessary labor scouring the archive for content they were interested in.

At the moment, this archive only goes back from April 2011 to 2001, and only includes Phoenix articles. Also, for some reason, the Phoenix issues from 2006 are not on Triptych, so my catalog doesn't include any articles from that year. I've brought this issue up with the folks at Swarthmore libraries, and hopefully they will fix it shortly!

why is its data kinda shitty?

The short answer is: I'm only human! I'm lazy, not getting paid to do this, and a mediocre typist. Sorry!

The long answer is:

  1. As I noted on the home page and on the why queer archives? page, queer and trans history is often hidden. In the Phoenix, these histories are surely disguised by now-extinct slang or student groups, or else more deeply coded in articles about clothing, body language, sex and love, political conservatism, etc. If I had chosen "sex" or "gender" or "body" or any other word relevant to queer and trans life but not always used in those contexts, I would have been stuck sorting through an immense amount of material. I chose to begin with the obvious keywords, for my own speed and ease. I wouldn't go back and do this differently; I'm confident that this method generated much of the content I was interested in finding, but it certainly didn't find all of it. This is especially the case for the transgender history of Swarthmore, which I found mentioned explicitly very rarely. Cross-dressing, which is often included under the umbrella of trans history, did turn up fairly often, in the context of drag performances or the annual Genderfuck party. This example of the codedness of trans history suggests the incompleteness of this archive.

  2. I didn't include all of the articles with keyword hits in the spreadsheet, and I made those subjective decisions by myself and with no hard-and-fast rule. Often, the keywords would be snuck in alongside other identity categories in an article about social justice or politics: "...fighting for low-income students, people of color, and the LGBTQ community" etc. Because the intent of QUISP is only to document material specifically relevant to queer and trans people, and not material relevant to social justice in general, I felt that articles that only mentioned queerness in passing didn't "count." The general-social-justice article is only one of many instances where a keyword was mentioned but not discussed or commented on in a way that felt relevant enough to QUISP, or that I imagined someone might want to access in the future. I feel that my definition of relevance was very broad, but certainly not static, and certainly shaped by my own assumptions and understandings.

  3. I generated a tagging practice as I went along. I obviously didn't have a list of tags before I began this process, so sometimes I'd decide on a useful tag only after I'd seen a few articles relevant to it, and I'd then have to retroactively tag earlier articles. I'm sure I did not do this perfectly. The tags I generated are entirely subjective, and I wasn't using any strict rules about when to apply them. I tried to be consistent, and I tried to make tags distinct and mutually exclusive.

  4. The descriptions of the articles are mostly quotes or paraphrases from the articles themselves. However, I obviously couldn't include the entire article, so I chose pieces I felt were especially important. I tried to give an overview of the entire contents of an article, but that's difficult for very disparate or very lengthy articles. Or, for example, if a homophobic incident was mentioned in a larger article about, say, the football team, I tended to only recount the homophobic incident in detail, and not the bulk of the article. The descriptions are the most dangerous, but also the most useful piece of metadata in the spreadsheet, in my opinion.

  5. The whole thing is riddled with clerical errors (my bad!) that I'm in the process of reducing. Google Sheets doesn't spellcheck or autocorrect my typing, and I'm not the most impeccable or attentive typist. There are missing or misprinted dates, names, titles, etc.

  6. The digital Phoenix archive is text-searchable because of error-prone Optical Character Recognition (OCR) technology, meaning the searchable text of the Phoenix was read and generated by a computer. Bad OCR results in similar-looking characters being mistaken for one another, so an article with the word "lesbian" might be OCR-ed as "Iesbian," "gay" as "yay," etc. While the OCR-ed text of the Phoenix is some of the cleanest I've encountered, it's very possible that some of the keywords I was interested in were not picked up because of these machine-made spelling errors in the archive itself.
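On point 3: if the spreadsheet is exported, a quick script can at least flag tags that look like accidental variants of each other. This is a hedged sketch, not something I actually ran; `flag_tag_drift` and its thresholds are my own invention:

```python
import difflib
from collections import Counter

def flag_tag_drift(tag_lists, min_count=3, cutoff=0.8):
    """Flag rarely-used tags that look like misspellings or variants
    of more common ones. A rough consistency check, not a fix."""
    counts = Counter(tag for tags in tag_lists for tag in tags)
    common = [t for t, c in counts.items() if c >= min_count]
    flags = []
    for tag, count in counts.items():
        if count < min_count:
            close = difflib.get_close_matches(tag, common, n=1, cutoff=cutoff)
            if close:
                flags.append((tag, close[0]))
    return flags

# e.g. a lone "dragg" next to three "drag" entries gets flagged:
print(flag_tag_drift([["drag"], ["drag"], ["drag"], ["dragg"]]))
```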
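On point 5: missing or misprinted dates are at least easy to find mechanically. A minimal sketch, assuming the spreadsheet is exported with a date column in YYYY-MM-DD format (the field name and format here are assumptions, not the actual export):

```python
from datetime import datetime

def find_bad_dates(rows, date_field="issue_date"):
    """Return indices of rows whose date is missing or doesn't parse.
    The field name and YYYY-MM-DD format are assumptions; adjust to
    match the real spreadsheet export."""
    bad = []
    for i, row in enumerate(rows):
        value = (row.get(date_field) or "").strip()
        try:
            datetime.strptime(value, "%Y-%m-%d")
        except ValueError:
            bad.append(i)
    return bad
```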
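On point 6: one way around OCR misreads is fuzzy matching instead of exact keyword search. Here's a minimal sketch using Python's standard difflib; the function name and similarity cutoff are my own choices, and real OCR cleanup would also need to handle split words and stranger garbling:

```python
import difflib

def fuzzy_hits(text, keywords, cutoff=0.85):
    """Find words in OCR-ed text that nearly (but not exactly) match
    a keyword -- e.g. 'Iesbian' for 'lesbian'. Punctuation is stripped;
    very short misreads like 'yay' vs. 'gay' fall below the cutoff."""
    hits = []
    for word in text.lower().split():
        word = word.strip(".,;:!?\"'()")
        for kw in keywords:
            if word != kw and difflib.SequenceMatcher(None, word, kw).ratio() >= cutoff:
                hits.append((word, kw))
    return hits

# "Iesbian" (capital I misread for lowercase l) is caught:
print(fuzzy_hits("a Iesbian student group", ["lesbian"]))
```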