Help on Code logic to remove duplicate mails from webapp mail box


On Wed, Oct 30, 2019 at 6:36 PM eman banerjee <emanbanerjee at gmail.com> wrote:
>
> On Wednesday, 30 October 2019 12:40:06 UTC+5:30, eman banerjee  wrote:
> > Hi
> >
> > I am working on a project where we connect to a webapp mailbox, extract the subject, sender, body, etc. from each mail, save them in a DataFrame, and insert them into a SQL DB.
> >
> > My next challenge is to remove any duplicate mails from the mailbox.
> > Could you kindly help me?
> > For example, each new mail entering the mailbox could be checked first; if it is a duplicate, it would not be inserted into the mailbox.
> >
> >
> > Thanks
>
>
> code is below
>
> class EmailAnalysis:
>
>     '''Method help to communicate with outlook'''
>
>     def emailExchangeCredentials(self):
>         try:
>         except:
>             account = 'Failure'
>             return account

The bare except is hiding useful exception messages, which are exactly
what would tell you why the connection failed. Don't do this.
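
To illustrate what gets lost, here is a minimal sketch. The function
connect_to_mailbox is a hypothetical stand-in for the connection code that
was elided from the post, and the server name is made up.

```python
def connect_to_mailbox(server):
    # Hypothetical stand-in for the real (elided) connection code.
    raise ConnectionError(f"cannot reach {server}")


def credentials_hiding_errors(server):
    try:
        return connect_to_mailbox(server)
    except:  # bare except: the reason for the failure is thrown away
        return 'Failure'


def credentials_reporting_errors(server):
    # No try/except: the ConnectionError propagates with its message
    # and traceback intact.
    return connect_to_mailbox(server)


# All the caller learns is the string 'Failure'.
hidden = credentials_hiding_errors("mail.example.invalid")

# Here the caller can see what actually went wrong.
try:
    credentials_reporting_errors("mail.example.invalid")
except ConnectionError as exc:
    reported = str(exc)
```

With the first version, the caller cannot tell a wrong password from a
network outage or a typo in the code itself.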


>     def extractEmail(self, account, Email_Data, conn):
>
>         self.Email_Data = Email_Data
>         SUBJECT = []
>         SENDER = []
>         JOB_NAME = []
>         EMAIL_DATE = []
>         EMAIL_BODY = []
>         REMEDIATION_ID = []
>         PRIORITY = []

Usually, all-caps names indicate constants, so this is potentially confusing.

>                     sql = "INSERT INTO ***DBname** (SENDER,SUBJECT,EMAIL_BODY,EMAIL_DATE) VALUES ("+"'"+str(item.sender.email_address)+"',"+"'"+str(item.subject)+"',"+"'"+str(email_body)+"',"+"'"+str(date_)+"')"
>                     print('SQL :- ',sql)
>                     result = cursor.execute(sql.encode('utf-8'))

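This is a VERY VERY bad idea. Building SQL by concatenating strings
breaks as soon as a subject or body contains a quote character, and it
leaves you open to SQL injection. Don't do this. Use parameterized queries.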
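
A rough sketch of the parameterized form, using the standard library's
sqlite3 module so it runs on its own. The table name "emails" and the
sample values are made up, and your database driver may use a different
placeholder style (for example %s instead of ?), so check its documentation.

```python
import sqlite3

# In-memory database purely for illustration.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute(
    "CREATE TABLE emails "
    "(SENDER TEXT, SUBJECT TEXT, EMAIL_BODY TEXT, EMAIL_DATE TEXT)"
)

# Sample values containing quotes, which would break (or exploit)
# a query built by string concatenation.
sender = "someone@example.com"
subject = "Job 'nightly' failed"
email_body = "See log'); DROP TABLE emails; --"
date_ = "2019-10-30"

# The SQL text contains only placeholders; the values are passed
# separately and are never parsed as SQL.
sql = (
    "INSERT INTO emails (SENDER, SUBJECT, EMAIL_BODY, EMAIL_DATE) "
    "VALUES (?, ?, ?, ?)"
)
cursor.execute(sql, (sender, subject, email_body, date_))
conn.commit()

rows = cursor.execute("SELECT SUBJECT, EMAIL_BODY FROM emails").fetchall()
```

The values come back exactly as they went in, quotes included, and the
table is still there.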

>                     item.move(to_folder)
>                 else:
>                     item.move(to_folder)
>
>             else:
>                 item.move(to_folder)

Not sure what the logic is here or when you wouldn't move the item to
the folder.

>     def emailAnalysisController(self,Task_Details):
>         try:
>             exection_status = 'Success'
>             return exection_status
>         except:
>             exection_status = 'Failure'
>             return exection_status

Python already has a very good mechanism for reporting failures. It's
called exception handling. I recommend using it, rather than absorbing
exceptions to give a generic return value.
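
As a sketch of what that could look like: the controller raises when
something goes wrong, and the caller decides what to do about it. The
names EmailAnalysisError, email_analysis_controller, and run, and the
"mailbox" key, are all invented for this example.

```python
class EmailAnalysisError(Exception):
    """Hypothetical error type, used only in this sketch."""


def email_analysis_controller(task_details):
    # Raise with a specific message instead of returning 'Failure'.
    if "mailbox" not in task_details:
        raise EmailAnalysisError("task details are missing 'mailbox'")
    # ... the real work would go here ...


def run(task_details):
    # The caller catches only the error it knows how to handle;
    # anything unexpected still propagates with a full traceback.
    try:
        email_analysis_controller(task_details)
    except EmailAnalysisError as exc:
        return f"failed: {exc}"
    return "ok"


bad = run({})
good = run({"mailbox": "inbox"})
```

The failure now carries its own explanation, and genuine bugs are not
silently turned into a status string.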

Once these issues are sorted out, you'll be able to think about
duplicate detection.
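
As a possible starting point for that later step, and assuming each mail
carries a stable identifier such as its Message-ID header (an assumption,
since the post does not show one), you could let the database reject
duplicates. This sketch uses sqlite3; the table, the column names, and
insert_if_new are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The PRIMARY KEY makes the database refuse a second row with the same id.
conn.execute("CREATE TABLE emails (message_id TEXT PRIMARY KEY, subject TEXT)")


def insert_if_new(conn, message_id, subject):
    """Insert the mail; return False if this message_id was already stored."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO emails (message_id, subject) VALUES (?, ?)",
                (message_id, subject),
            )
    except sqlite3.IntegrityError:
        return False
    return True


first = insert_if_new(conn, "<abc@example.com>", "Job failed")
second = insert_if_new(conn, "<abc@example.com>", "Job failed")
count = conn.execute("SELECT COUNT(*) FROM emails").fetchone()[0]
```

Only the specific IntegrityError is caught here, so any other database
problem still surfaces as an exception.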

ChrisA