How to compare words from .txt file against words in .xlsx file via Python? I will then extract these words by writing it to a new .xls file


On 2019-08-04 09:29, aishan0403 at gmail.com wrote:
> I want to compare the common words from multiple .txt files based on the words in multiple .xlsx files.
> 
> Could anyone kindly help with my code? I have been stuck for weeks and really need help..
> 
> Please refer to this link:
> https://stackoverflow.com/questions/57319707/how-to-compare-words-from-txt-file-against-words-in-xlsx-file-via-python-i-wi
> 
> Any help is greatly appreciated really!!
> 
First of all, in this line:

     folder_path1 = os.chdir("C:/Users/xxx/Documents/xxxx/Test python dict")

it changes the current working directory (not a problem), but 'chdir' 
returns None, so from that point 'folder_path1' has the value None.

Then in this line:

     for file in os.listdir(folder_path1):

it's actually doing:

     for file in os.listdir(None):

which happens to work only because passing None makes 'listdir' return 
the names in the current directory.
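
A small sketch of that behaviour, and of one way to avoid relying on it: 
keep the path in a variable and pass it straight to 'listdir'. The 
temporary folder, the file names and the helper 'list_txt_files' are 
made up for the demonstration; they are not from your code.

```python
import os
import tempfile


def list_txt_files(folder_path):
    """Return the sorted names of the .txt files in folder_path."""
    return sorted(
        name for name in os.listdir(folder_path)
        if name.lower().endswith(".txt")
    )


# A throwaway folder stands in for "Test python dict".
with tempfile.TemporaryDirectory() as folder_path1:
    for name in ("a.txt", "b.txt", "words.xlsx"):
        open(os.path.join(folder_path1, name), "w").close()

    original_cwd = os.getcwd()
    try:
        # chdir changes the working directory and returns None.
        chdir_result = os.chdir(folder_path1)
        # listdir(None) lists the current directory, so this "works".
        names_via_none = sorted(os.listdir(chdir_result))
    finally:
        # Go back so the temporary folder can be removed afterwards.
        os.chdir(original_cwd)

    # Passing the real path does not depend on the working directory.
    txt_files = list_txt_files(folder_path1)

print(chdir_result)
print(names_via_none)
print(txt_files)
```

With the path kept as a string you can also build full file names with 
os.path.join(folder_path1, name) instead of depending on 'chdir'.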

Now to your problem.

This line:

     dictionary = cell_range.value

sets 'dictionary' to the value in the spreadsheet cell, and you're doing 
it each time around the loop, overwriting the previous value. At the end 
of the loop, 'dictionary' will be set to the _last_ such value. You're 
not collecting the values, merely remembering the last one.
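
To illustrate the difference, here is a rough sketch that uses a plain 
list, 'cell_values', as a stand-in for the values you read from the 
spreadsheet cells. That list and the name 'dictionary_words' are my own 
assumptions; in your code the values would come from each cell's 
'.value' inside the loop.

```python
# Stand-in for the values read from the spreadsheet cells.
# None represents an empty cell.
cell_values = ["Apple", "banana ", None, "cherry"]

# What the original loop does: each assignment replaces the last one.
dictionary = None
for value in cell_values:
    dictionary = value

# Collecting instead: add every non-empty value to a set.
dictionary_words = set()
for value in cell_values:
    if value is not None:
        # Normalise so that "Apple" and "apple" compare equal.
        dictionary_words.add(str(value).strip().lower())

print(dictionary)
print(sorted(dictionary_words))
```

A set is used here because you only need to ask "is this word in the 
dictionary?"; a list would also work, just more slowly for many words.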

Looking further on, there's this line:

     if txtwords in dictionary:

Remember, 'dictionary' is the last value (a string), so that'll be True 
only if 'txtwords' is a substring of the string in 'dictionary'.

That's why you're seeing only one match.
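
A short sketch of the difference between the two kinds of 'in' test. 
The sample words and the names 'last_value', 'dictionary_words' and 
'txt_words' are invented for the example; it assumes you have already 
gathered all the spreadsheet words into a set.

```python
# 'in' on a string is a substring test.
last_value = "pineapple"
substring_hit = "apple" in last_value    # True: "apple" is inside it
substring_miss = "banana" in last_value  # False: not a substring

# 'in' on a set is a whole-word membership test.
dictionary_words = {"apple", "banana", "pineapple"}

# Words from a text file, lower-cased and split on whitespace.
txt_words = "I ate a banana and an apple".lower().split()

# Keep the words from the text that appear in the dictionary.
matches = [word for word in txt_words if word in dictionary_words]

print(substring_hit, substring_miss)
print(matches)
```

Note that the substring test would also report short words such as "a" 
as matches against "pineapple", which is probably not what you want.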