Frequency of two different strings (with their derived forms) in a large book
Analyzing text: the co-occurrence of two related variables in a book or large text
Imagine we have a large book or other large text, plus two variables, var1 and var2, and we want to find every line of the text where the two meet. The goal is to see how the two variables relate to each other in the context of that book or text.
Suppose we have Mr. Kane’s large book and have extracted the derived forms of the verb share. Then var1 is share, listed here with its derived forms:
share = [' shared ', ' sharing ', ' sharer ', ' share ', ' sharers ']
var2 can be any verb that I frequently saw next to var1 (share) while reading, let’s say confine:
confine = [' confinement ', ' confiner ', ' confine ', ' confines ']
Our two variables contain only keys, not values: we don’t know how many times each derived form appears in the text, in other words its frequency. So we add some values:
share = [(' shared ', 45), (' sharing ', 32), (' sharer ', 27), (' share ', 23), (' sharers ', 19)]
confine = [(' confinement ', 43), (' confiner ', 29), (' confine ', 26), (' confines ', 16)]
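If you would rather compute the counts from the text than add them by hand, something like the following could work. It is only a sketch: count_derivatives is a hypothetical helper, not part of the function used in this post, and the sample sentence is made up for the demonstration. It keeps the space-padded keys so the result has the same shape as the lists above.

```python
import re
from collections import Counter

def count_derivatives(text, derivatives):
    """Return (derived form, count) pairs, most frequent first.

    Hypothetical helper: `derivatives` is a list of keys such as
    ' shared ' (the padding spaces are ignored when counting).
    """
    # Lowercase the text and split it into alphabetic words.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # Look up each key without its padding, but keep the key as given.
    pairs = [(d, counts[d.strip().lower()]) for d in derivatives]
    return sorted(pairs, key=lambda kv: kv[1], reverse=True)

# Made-up sample text, only to show the shape of the result.
sample = "They shared the room. Sharing is easy; he shared again, and the sharer left."
print(count_derivatives(sample, [' shared ', ' sharing ', ' sharer ', ' share ']))
```

For a real book, read the file first and pass its contents as text.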
We will now create a function that will help us with further research:
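>>> def var_in_other_var_in_book(var1, var2):
...     # Read the book and split it into lines.
...     with open('book.txt') as book:
...         sl = book.read().splitlines()
...     # Keep only the keys (the derived forms) and drop the counts.
...     k_1 = [k for k, v in var1]
...     k_2 = [k for k, v in var2]
...     # Print every line that contains a form from each variable.
...     for tom in k_1:
...         for tim in k_2:
...             for w in sl:
...                 if tim in w and tom in w:
...                     print(w)
...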
Now, where in the book or large text do share and confine meet on the same line? Let’s pass confine and share to our function to get the result:
>>> var_in_other_var_in_book(confine, share)
Don’t forget to specify your text file name in the function.
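One limitation worth knowing: because the keys are padded with spaces, a plain substring test misses a word at the start or end of a line, or one next to punctuation. A line containing several matching pairs is also printed more than once. The variant below is one possible way around both. It is a sketch, not the function used above: lines_with_both and var_in_other_var_in_file are hypothetical names, the file name is passed in as a parameter, and matching is done on whole words, ignoring case, through a regular expression.

```python
import re

def lines_with_both(lines, var1, var2):
    """Return each line that contains a word from var1 and a word from var2."""
    def pattern(pairs):
        # Strip the padding spaces from each key and match whole words only.
        words = [re.escape(k.strip()) for k, _ in pairs]
        return re.compile(r"\b(?:" + "|".join(words) + r")\b", re.IGNORECASE)

    p1, p2 = pattern(var1), pattern(var2)
    # Each matching line is kept once, however many pairs it contains.
    return [line for line in lines if p1.search(line) and p2.search(line)]

def var_in_other_var_in_file(filename, var1, var2):
    """Same search, reading the lines from a text file (hypothetical wrapper)."""
    with open(filename, encoding="utf-8") as book:
        return lines_with_both(book.read().splitlines(), var1, var2)

# Made-up lines, only to show the behaviour.
demo_share = [(' shared ', 45), (' share ', 23)]
demo_confine = [(' confinement ', 43), (' confine ', 26)]
demo_lines = [
    "Shared confinement, they said.",
    "We share nothing.",
    "The confines were narrow.",
    "unshared confinement",
]
for line in lines_with_both(demo_lines, demo_confine, demo_share):
    print(line)
```

Returning the lines instead of printing them makes it easy to count them or save them for later analysis.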