To gather my corpora, I collected folklore stories from the Project Gutenberg website. For each country's corpus, I wanted a diverse selection of tales but roughly the same amount of text from each country, so that the results were not biased toward one country or another. When a country's stories turned out to be fairly short, I included more tales from that country in its corpus.
After gathering three corpora, I explored them with tools such as Word Trees and Word Counter to see which words were important in each corpus, and I settled on a group of general terms against which each corpus could be compared. I chose commonly translated words: ‘good’, ‘evil’, ‘king’, ‘fight’, ‘girl’, ‘boy’, ‘man’, ‘woman’, ‘monster’, ‘house’, ‘faith’, ‘love’, ‘hate’, and ‘island’. I added ‘island’ purely out of curiosity, to see how European and Russian tales view islands compared with Asian tales, not to throw everything off! This common set of words showed that mortals such as children (girl/boy), women, and men usually appear in the tales. Gods and values that teach good and evil are usually present, as are fights and battles, and love and faith usually feature as well. This was the common thread among the folklore of each culture.
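The comparison of a shared term list across corpora can be sketched in a few lines of Python. This is only an illustration, not the Word Counter tool used above: the function name `seed_term_rates`, the toy sentences, and the choice to report counts per 1,000 tokens (so that corpora of different lengths stay comparable) are all assumptions made for the example.

```python
import re
from collections import Counter

# Seed terms taken from the list in the text above.
SEED_TERMS = ["good", "evil", "king", "fight", "girl", "boy", "man",
              "woman", "monster", "house", "faith", "love", "hate", "island"]

def seed_term_rates(text, terms=SEED_TERMS):
    """Return occurrences per 1,000 tokens for each seed term (hypothetical helper)."""
    # Lowercase and split into simple word tokens, so 'man' never matches inside 'woman'.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero on an empty text
    return {term: 1000 * counts[term] / total for term in terms}

# Placeholder snippets, NOT real corpus text; real input would be the Gutenberg files.
toy_corpora = {
    "european": "The good king rode to fight the monster for love and faith.",
    "asian": "The boy sailed to the island where a monster lived.",
}

rates = {name: seed_term_rates(text) for name, text in toy_corpora.items()}
for name, table in rates.items():
    present = sorted(term for term, rate in table.items() if rate > 0)
    print(name, present)
```

Reporting a rate rather than a raw count is one way to keep a longer corpus from looking as though it simply cares more about every term.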
When training the model, I wanted to see the graph plots, which gave me an idea of how words were distributed in each culture's corpus. From the common set of words I had chosen, I saw many interesting similarities and differences. I decided to use the Word2Vec model's ‘closest to’ query for one word and for two words. I chose ‘closest to’ rather than analogies after weighing several factors from my research. Because I was collecting folklore from many countries and cultures across Europe, Asia, and Russia, any story could contain mistranslations, and different words could be used to describe similar ideas or stories. Using analogies would assume that the stories I gathered had all been translated in very similar ways. Another point worth noting is the set of similarities and differences among the three cultures' folklore stories.
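To make the ‘closest to’ idea concrete, here is a minimal sketch in plain Python. It is not the tool used for the analysis: the function `closest_to`, the three-dimensional toy vectors, and the decision to represent a two-word query as the average of the two word vectors are assumptions for illustration. A trained Word2Vec model would supply vectors with far more dimensions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def closest_to(vectors, query_words, n=3):
    """Rank vocabulary words by cosine similarity to the mean of the query vectors."""
    dims = len(next(iter(vectors.values())))
    # Assumption: a multi-word query is the average of its word vectors.
    query = [sum(vectors[w][i] for w in query_words) / len(query_words)
             for i in range(dims)]
    scored = [(word, cosine(query, vec))
              for word, vec in vectors.items()
              if word not in query_words]  # do not return the query words themselves
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:n]

# Made-up vectors for illustration only; they do not come from any trained model.
toy_vectors = {
    "king":   [0.9, 0.1, 0.0],
    "mighty": [0.8, 0.2, 0.1],
    "fight":  [0.7, 0.0, 0.6],
    "brave":  [0.8, 0.1, 0.4],
    "rice":   [0.0, 0.9, 0.1],
}

one_word = closest_to(toy_vectors, ["king"])
two_words = closest_to(toy_vectors, ["king", "fight"])
print(one_word)
print(two_words)
```

Unlike an analogy query, this lookup compares words only within a single corpus's vector space, so it does not depend on different translators having chosen parallel vocabulary.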
In my analysis of the Asian folklore stories from Project Gutenberg, I saw a more varied and diverse set of words being clustered. In the clusters from the European folklore, more words grouped around noble and Western ideas, such as “honor”, “mighty”, “faith”, “horse”, and “chariot”. I found it interesting that when war and fighting were involved, the Asian tales used more explicit words around war, such as ‘savage’, ‘captain’, ‘monsters’, and ‘fierce’, whereas the European tales used words such as ‘brave’, ‘mighty’, and ‘valiant’ instead. As noted above, the Asian clusters were varied and sparse, but nature, war, animals, and islands were the main ideas in the Asian corpus, while the European clusters grouped more abstract ideas such as emotions and honor. The Russian folklore, on the other hand, showed something of a mix, yet also originality. It showed explicit violence, with words such as “chains”, “kill”, and “avenging”, but it also reflected stringent societal values: a word associated with ‘woman’ was ‘crone’, and the words associated with ‘faith’ were ‘fidelity’ and ‘crime’. Another fascinating observation was that the words associated with food were ‘meat’ and ‘liquor’ in Russian folklore, ‘wine’ and ‘starving’ in European folklore, and ‘bread’ and ‘rice’ in Asian folklore. This suggests a connection between Russian and European culture around food, with alcohol probably a common thread between the two.
As for nature, Asian folklore views it as ‘poetic’, almost giving it a positive artistic value. European folklore associates it with ‘enjoyment’ or ‘action’, which suggests a shared enjoyment of physical activity in nature. Russian folklore seems to associate it with ‘myths’ and ‘fiction’, which suggests that there could also be a dissociation between Russian and Asian/European cultures. This shows that there is also no clear-cut relationship among the three cultures.
The questions I faced about whether the connections I drew between cultures were too broad or too specific were similar to the criticism faced by the Aarne-Thompson classification and by Vladimir Propp in their investigations of the relationships among Proto-Indo-European and Russian folklore. The Aarne-Thompson index gave broad categories to aspects of folklore across cultures, almost like giving tags to each fable. Vladimir Propp, by contrast, argued that folklore should be categorized by the actions or functions of the storylines. I found it interesting that both approaches try to compare, contrast, and draw relationships between similar lines of folklore, yet they go about the comparison in different ways. I can understand both strategies, and that is another reason why I used basic words and comparisons in my word2vec model to draw relationships among the Asian, European, and Russian folklore. By comparing the corpora through basic key words, I figured I would be able to trace similarities and differences in the perspectives behind each culture's values and ideals. Folklore is interesting because it acts not only as an artifact that can explain culture but also as a way of ‘explaining’ history. Whether or not the stories are actually true, they give insight into how groups of people from different parts of the world sought to explain particular occurrences, or the world and life in general. Although connections can be drawn, each country's folklore is distinctly different. This is why models such as word2vec, and research built on them, can help reveal deep layers of history and culture.
“Folklore (Bookshelf).” Project Gutenberg, www.gutenberg.org/wiki/Folklore_(Bookshelf).
Giaimo, Cara. “The ATU Fable Index: Like the Dewey Decimal System, But With More Ogres.” Atlas Obscura, 14 June 2017, www.atlasobscura.com/articles/aarne-thompson-uther-tale-type-index-fables-fairy-tales.
Goble, Warwick. An illustration by Warwick Goble for Beauty and the Beast. 1913. Wikimedia Commons, commons.wikimedia.org/wiki/File:Warwick_Goble_Beauty_and_Beast.jpg.