Packages are namespaces that can themselves contain multiple packages and modules. They are simply directories, but with a twist. Each regular package in Python is a directory that contains a special file called __init__.py. This file can be empty, and it indicates that the directory containing it is a Python package, so it can be imported the same way a module can be imported. (Since Python 3.3, a directory without this file can still be imported as a namespace package, but including __init__.py remains the conventional way to mark a package.) Python can be extended via libraries to allow data scientists to tackle machine learning, data analysis, and beyond.
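The package layout described above can be sketched in a few lines. This is only an illustration: the names demo_pkg, greetings, and hello are made up for this example, and the package is written to a temporary directory so the script can run on its own.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Create root/demo_pkg with an empty __init__.py and one module."""
    pkg = root / "demo_pkg"  # hypothetical package name
    pkg.mkdir()
    # The (empty) __init__.py marks the directory as a regular package.
    (pkg / "__init__.py").write_text("")
    # A module inside the package, holding one function.
    (pkg / "greetings.py").write_text(
        "def hello(name):\n"
        "    return f'Hello, {name}!'\n"
    )


with tempfile.TemporaryDirectory() as tmp:
    build_demo_package(Path(tmp))
    sys.path.insert(0, tmp)  # make the parent directory importable
    try:
        importlib.invalidate_caches()
        # Import a module from the package just like any other module.
        from demo_pkg import greetings
        message = greetings.hello("world")
    finally:
        sys.path.remove(tmp)

print(message)  # prints "Hello, world!"
```

In a real project the directory would live next to your script or be installed into the environment, so the sys.path handling shown here would not be needed.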
The growth of Python in data science has gone hand in hand with that of Pandas, which opened up the use of Python for data analysis to a broader audience by enabling it to handle row-and-column datasets, import CSV files, and much more.
While Pandas may be the best-known library, there are hundreds of specialized libraries that serve important purposes. A few of them are:
- SymPy (SYMBOLIC MATHEMATICS)
- PyMC (BAYESIAN STATISTICAL MODELING AND PROBABILISTIC MACHINE LEARNING)
- matplotlib (PLOTTING AND VISUALIZATION)
- PyTables (STORAGE AND DATA FORMATTING)
- NumPy (ADVANCED MATH FUNCTIONALITIES)
- Requests (HTTP LIBRARY, MUST HAVE FOR EVERY PYTHON DEVELOPER)
- Scrapy (WEB SCRAPING)
- wxPython (GUI TOOLKIT FOR PYTHON)
- BeautifulSoup (XML AND HTML PARSING LIBRARY)
- SciPy (ALGORITHMS AND MATHEMATICAL TOOLS FOR PYTHON)
- NLTK (NATURAL LANGUAGE PROCESSING AND STRING MANIPULATION)
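
Since each of these libraries is imported like any other package, one quick way to see which of them are available in your environment is to ask Python whether it can locate them. The sketch below uses only the standard library. The import names in the table are assumptions based on each project's usual top-level module (for example, PyTables is normally imported as tables and BeautifulSoup as bs4), and is_installed and installed_report are hypothetical helper names.

```python
from importlib.util import find_spec

# Assumed top-level import names; these can differ from the install name.
IMPORT_NAMES = {
    "SymPy": "sympy",
    "PyMC": "pymc",
    "matplotlib": "matplotlib",
    "PyTables": "tables",
    "NumPy": "numpy",
    "Requests": "requests",
    "Scrapy": "scrapy",
    "wxPython": "wx",
    "BeautifulSoup": "bs4",
    "SciPy": "scipy",
    "NLTK": "nltk",
}


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return find_spec(module_name) is not None


def installed_report(names=IMPORT_NAMES) -> dict:
    """Map each library label to whether its module was found."""
    return {library: is_installed(module) for library, module in names.items()}


if __name__ == "__main__":
    for library, available in installed_report().items():
        status = "installed" if available else "missing"
        print(f"{library}: {status}")
```

The result depends entirely on what is installed where the script runs, so no particular output should be expected.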