PLEASE USE PYTHON Overview TargetCRM is CRM software company that sends emails to users every week. The system has a mai
Posted: Mon May 09, 2022 7:17 am
PLEASE USE PYTHON
Overview
TargetCRM is a CRM software company that sends emails to users
every week. The system has a mailing list of users that are active
and able to receive emails as well as users that have been
unsubscribed. Users that opt out are removed from the database and
the remaining users are considered active and keep receiving
emails.
A recurrent issue that TargetCRM has is with computing
statistics to analyze the actual state of the mailing list. Because
of incorrect user input, there are a lot of email addresses in the
mailing list in the wrong format, which is undesirable for the
company.
The CTO (Chief Technology Officer) of the company wanted to
generate a report to check the quality of the email address data.
However, when they aggregated the data to create the report, they
noticed that the data quality was not good. As more people started
using the application, more erroneous email address entries were
found. As the system uses email to contact the end users, that was
found to be a critical issue.
Take into account that each user with an incorrectly formatted
email address in our mailing list is a final customer to whom we
cannot send notifications because their email is unreachable.
Knowing this, for this project, TargetCRM wants to filter out these
users in order to free up storage space for valid users and, in the
future, work toward a way to improve overall data quality.
Your solution will allow TargetCRM to save a considerable amount
of money from its monthly budget, because your program will free up
a team in the company from doing manual checks every week to filter
out incorrectly formatted email address data. Manually checking for
errors is a very tedious and repetitive task that is costly and
prone to failure, which is why automating this process is crucial
for the company.
After finishing this project, your solution will be integrated
into the mailing list updater application, and the TargetCRM
platform will be updated into a newer version with this
functionality.
What you need to do
Best Practices to Follow:
Create EmailNotValidError()
Create a user-defined exception called EmailNotValidError() that
will be thrown when an invalid email is found. This exception must
extend a base exception class to work. Your exception will also
accept a message that will be passed by the other functions. This
message helps programmers to understand what sort of error happened
while debugging the code.
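A minimal sketch of such an exception is shown here. The default message text is an assumption chosen for illustration; the assignment only requires that the class extend a base exception class and accept a message.

```python
class EmailNotValidError(Exception):
    """Raised when the target email is not valid."""

    def __init__(self, message="Email address is not valid"):
        # Keep the message on the instance and pass it to Exception,
        # so that str(error) and tracebacks show it.
        self.message = message
        super().__init__(message)


# Usage: the message passed by the caller is what appears when debugging.
try:
    raise EmailNotValidError("missing @ in 'john.example.com'")
except EmailNotValidError as err:
    print(err)
```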
Create is_email_valid()
Create a function called is_email_valid() that is going to be
responsible for checking whether an email address is valid or not.
This function will receive the updated mailing_list, and for each
email, you will check whether the address contains
an @ symbol. If the email does not
contain the symbol, your code should raise your custom-defined
exception.
TASK
Create the function is_email_valid to check whether an email
address is valid or not. If the address is not, the function throws
the custom EmailNotValidError exception.
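One possible shape for this function is sketched below. It assumes that mailing_list is a dictionary mapping user IDs to email strings (the template's "for key, email in" loop suggests this, but the post does not show the data), and returning True when every address passes is also an assumption.

```python
class EmailNotValidError(Exception):
    """Raised when the target email is not valid."""


def is_email_valid(mailing_list):
    """Check every address in mailing_list for an '@' symbol.

    Assumes mailing_list is a dict of {user_id: email}.
    Raises EmailNotValidError on the first address without an '@'.
    """
    for key, email in mailing_list.items():
        if "@" not in email:
            raise EmailNotValidError(
                f"Invalid email for user {key}: {email!r}"
            )
    return True  # Assumption: signal success when nothing was raised
```

Note that this version stops at the first invalid address, because raising an exception leaves the loop immediately.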
Make it your own
This part involves techniques for solving real-world problems,
such as string pattern matching. You will need this ability as a
software developer.
In the bonus project, we will challenge you to solve problems
that are real-world demands in software development. For the first
part, you will work with a text pattern-matching technique called
regular expressions (or regexes). Then, for the second part, you
will create a second user-defined exception that will be used to
validate whether a user’s email contains a provider that is
blacklisted (this blacklist is based on length, explained later in
part 2) due to characteristics in the name.
Your solution has been working well for a couple of weeks. After
a few iterations, TargetCRM’s CTO has collected the mailing
list data again to check whether the email address data quality has
improved. As expected, the number of incorrect emails has
decreased significantly, and the company is happy with the initial
improvements that your contribution has brought to the
platform.
However, after a few weeks of iterations and tests, the CTO
realized that far more complicated errors were present in the email
address data that a simple @ symbol
check was not covering. They then realized that the feature was not
capturing the more complex incorrect format inputs.
Here are a couple of the bad email formats that passed
the initial validation:
These invalid email addresses had incorrectly passed our current
validation. Therefore, to create a more robust validator, we will
need to implement a more complex technique. Implement a solution
using regexes to create a more complex email validator.
Python has a built-in library for working with regexes, called
re. Take a look at the regex patterns to validate email
formats.
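As a rough sketch of the regex approach, the example below uses re.fullmatch with a deliberately simple pattern. The pattern, the EMAIL_PATTERN name and the validate_email_format helper are illustrative assumptions, not part of the assignment, and the pattern does not cover every address allowed by the email standards.

```python
import re


class EmailNotValidError(Exception):
    """Raised when the target email is not valid."""


# Illustrative pattern: local part, a single '@', one or more domain
# labels separated by dots, and a final label of at least two letters.
EMAIL_PATTERN = re.compile(
    r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}"
)


def validate_email_format(email):
    """Raise EmailNotValidError unless the whole string matches the pattern."""
    if EMAIL_PATTERN.fullmatch(email) is None:
        raise EmailNotValidError(f"Badly formatted email: {email!r}")
    return True
```

Using fullmatch rather than search means the entire string has to fit the pattern, so inputs such as "user@@example.com" or "user@example" are rejected even though they contain an @ symbol.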
After some time running your code, the company found out that
the system is sending notifications to users whose email addresses
use reportedly malicious providers. Your solution is not
accounting for that right now, and so you will need to implement
this functionality.
The company identified a pattern in the email addresses that
were reported as spam or malicious. They found that a provider
with fewer than five characters is probably an email provider to
whom we want to avoid sending notifications.
Create a second custom exception to be raised when an email
address with a provider with fewer than five characters is
found.
Once you have filtered the emails that are invalid (in our
scenario), return the IDs of the users that are able to receive our
communication.
Once you have the email address as an entire string value, you
will need to extract only the provider part to compute its
length. Try to find a way to split the string by a common
delimiter, then retrieve only the part you are interested in, which
is the provider.
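The hint above could be sketched as follows. Several things here are assumptions: the provider is taken to be the text between the @ and the first dot after it, mailing_list is taken to be a dictionary of user IDs to emails, and the names ProviderNotValidError, get_provider and filter_valid_users are hypothetical.

```python
class EmailNotValidError(Exception):
    """Raised when the target email is not valid."""


class ProviderNotValidError(Exception):
    """Raised when the email provider name is too short (hypothetical name)."""


MIN_PROVIDER_LENGTH = 5  # Providers with fewer characters are rejected


def get_provider(email):
    """Return the part between '@' and the first following dot."""
    domain = email.split("@")[-1]
    return domain.split(".")[0]


def filter_valid_users(mailing_list):
    """Return the IDs of users whose email passes both checks."""
    valid_ids = []
    for key, email in mailing_list.items():
        try:
            if "@" not in email:
                raise EmailNotValidError(f"No @ in {email!r}")
            if len(get_provider(email)) < MIN_PROVIDER_LENGTH:
                raise ProviderNotValidError(f"Provider too short in {email!r}")
        except (EmailNotValidError, ProviderNotValidError) as err:
            print(f"Skipping user {key}: {err}")
        else:
            valid_ids.append(key)
    return valid_ids
```

Placing the try block inside the loop lets the function skip one bad address and continue with the rest, which fits the requirement to return the IDs of all users who can still receive communication.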
Finally, the MAIN.PY file:
class EmailNotValidError():
    """ Raised when the target email is not valid """
def is_email_valid(mailing_list):
    """
    Your docstring documentation starts here.
    For more information on how to properly document your function, please refer to the official PEP8:
    https://www.python.org/dev/peps/pep-000 ... on-strings.
    """
    for key, email in # Loop through the mailing list:
        if '@' not in # Check if the email contains an @:
            raise # Raise an EmailNotValidError exception if the @ is not present
def is_email_valid_extended(mailing_list):
    """
    Your docstring documentation starts here.
    For more information on how to properly document your function, please refer to the official PEP8:
    https://www.python.org/dev/peps/pep-000 ... on-strings.
    """
    final_users_list = # Array to hold user ids
    # Inserted a try.., except.. block to catch the exception
    try:
        # Loop through the mailing list
        for key, email in # Your mailing list:
            if '@' in # Check if the @ is present in the email:
                # Append the id of users with valid emails
            else:
                raise # Raises an EmailNotValidError otherwise
    except # Your user-defined exception:
        return # Return a user-friendly message describing the exception
def is_email_valid_extended_finally(mailing_list):
    """
    Your docstring documentation starts here.
    For more information on how to properly document your function, please refer to the official PEP8:
    https://www.python.org/dev/peps/pep-000 ... on-strings.
    """
    final_users_list = # Array to hold user ids
    # Inserted a try.., except.. block to catch the exception
    try:
        # Loop through the mailing list
        for key, email in # Your mailing list:
            if '@' in # Check if the @ is present in the email:
                # Append the id of users with valid emails
            else:
                raise # Raises an EmailNotValidError otherwise
    except # Your user-defined exception:
        # Print a user-friendly message describing the exception
    finally:
        return # Return the id of the users with valid email
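For reference, one way the last template function might be filled in is sketched below. It follows the try/except/finally layout of the template, assumes mailing_list is a dictionary of user IDs to emails, and uses a message wording that is purely illustrative.

```python
class EmailNotValidError(Exception):
    """Raised when the target email is not valid."""


def is_email_valid_extended_finally(mailing_list):
    """Collect user IDs until the first email without an '@' is found.

    Assumes mailing_list is a dict of {user_id: email}.
    """
    final_users_list = []  # IDs of users with valid emails
    try:
        for key, email in mailing_list.items():
            if "@" in email:
                final_users_list.append(key)
            else:
                raise EmailNotValidError(f"Invalid email for user {key}")
    except EmailNotValidError as err:
        # User-friendly message instead of a traceback
        print(f"Stopped checking: {err}")
    finally:
        # Runs whether or not an exception was raised
        return final_users_list
```

Because the loop sits inside the try block, the first invalid address ends the loop, so users that come after it are never checked. A return inside finally also suppresses any other exception that is in flight, so moving the try block inside the loop and returning after it may be the safer design if every address must be examined.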