When Big Data Meets Python?


WHY PYTHON FOR DATA ANALYSIS?

For many people, the Python language is easy to fall in love with. Since its first appearance in 1991, Python has become one of the most popular dynamic programming languages, along with Perl, Ruby, and others. Python and Ruby have become especially popular in recent years for building websites, thanks to their numerous web frameworks, like Rails (Ruby) and Django (Python). Such languages are often called scripting languages, as they can be used to write quick-and-dirty small programs, or scripts. I don’t like the term “scripting language”, as it carries the connotation that these languages cannot be used for building mission-critical software. Among interpreted languages, Python is distinguished by its large and active scientific computing community. Adoption of Python for scientific computing in both industry applications and academic research has increased significantly since the early 2000s. For data analysis, interactive exploratory computing, and data visualization, Python will inevitably draw comparisons with the many other domain-specific open source and commercial programming languages and tools in wide use, such as R, MATLAB, SAS, Stata, and others. In recent years, Python’s improved library support (primarily pandas) has made it a strong alternative for data manipulation tasks. Combined with Python’s strength in general-purpose programming, it is an excellent choice as a single language for building data-centric applications.

Python training helps you build applications and apply Python to data analytics. The language lets you work quickly and integrate systems effectively. Its design philosophy emphasizes code readability, and its syntax allows programmers to express concepts in fewer lines of code.
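As a small, hedged illustration of the kind of concise data manipulation pandas enables (the column names and values below are made up for the example):

    # A minimal sketch of concise data manipulation with pandas.
    # The "city" and "sales" columns and their values are invented for illustration.
    import pandas as pd

    df = pd.DataFrame({
        "city": ["Austin", "Austin", "Dallas", "Dallas"],
        "sales": [100, 150, 200, 80],
    })

    # Group by city and compute total and average sales in one readable line.
    summary = df.groupby("city")["sales"].agg(["sum", "mean"])
    print(summary)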

Python as Glue

Part of Python’s success as a scientific computing platform is its ease of integrating C, C++, and FORTRAN code. Most modern computing environments share a similar set of legacy FORTRAN and C libraries for doing linear algebra, optimization, integration, fast Fourier transforms, and other such algorithms. The same story has held true for many companies and national labs that have used Python to glue together 30 years’ worth of legacy software. Most programs consist of small portions of code where most of the time is spent, with large amounts of “glue code” that doesn’t run often. In many cases, the execution time of the glue code is insignificant; effort is most fruitfully invested in optimizing the computational bottlenecks, sometimes by moving the code to a lower-level language like C. During the last few years, the Cython project (cython.org) has become one of the preferred ways of both creating fast compiled extensions for Python and interfacing with C and C++ code.
For more, see the Python Tutorials.
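As a minimal sketch of what this “glue” role can look like in practice, the snippet below calls a routine in the system’s compiled C math library via ctypes; it assumes a POSIX-like system where that library can be located, and sqrt merely stands in for a legacy C/FORTRAN numerical routine:

    # A hedged sketch of Python as glue: calling into a compiled C library with ctypes.
    # Assumes a POSIX-like system where the C math library can be found.
    import ctypes
    import ctypes.util

    libm_path = ctypes.util.find_library("m")  # locate the C math library
    libm = ctypes.CDLL(libm_path)

    # Declare the C signature so ctypes converts arguments and results correctly.
    libm.sqrt.restype = ctypes.c_double
    libm.sqrt.argtypes = [ctypes.c_double]

    print(libm.sqrt(2.0))  # 1.4142135623730951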

Integrated Development Environments (IDEs):

When asked about my standard development environment, I almost always say IPython plus a text editor. I typically write a program and iteratively test and debug each piece of it in IPython. It is also useful to be able to play around with data interactively and visually verify that a particular set of data manipulations is doing the right thing; a short sketch of that workflow follows the list below. Libraries like pandas and NumPy are designed to be easy to use in the shell. However, some will still prefer to work in an IDE instead of a text editor. IDEs do provide many nice “code intelligence” features, like completion or quickly pulling up the documentation associated with functions and classes. Here are some that you can explore:

  • Eclipse with PyDev Plugin

  • Python Tools for Visual Studio (for Windows users)

  • PyCharm

  • Spyder

  • Komodo IDE
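As a short, made-up sketch of the interactive workflow mentioned above (lines you would type one at a time in IPython, checking each result before moving on):

    # A made-up sketch of iterative, check-as-you-go work in an IPython session;
    # IPython echoes each expression's result so you can verify it immediately.
    import numpy as np

    data = np.array([1.5, 2.0, 0.5, 3.0])   # toy data for illustration
    data.mean()                              # inspect the mean before proceeding
    scaled = data / data.max()               # normalize to the maximum value
    scaled                                   # visually confirm the result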

Do you have sources regarding “When Big Data Meets Python?” Feel free to discuss.