I have been using R for a while as my analytics platform. I have also used Python, mainly for system-level programming and for "gluing" R code together. Recently, I started looking into Python's analytic power in more detail and would like to compare it with R.
R has a great ecosystem. I use RStudio or ESS in Emacs as my programming environment, ggplot2 for plotting, foreach for parallel computing, and knitr for writing reports. There are tons of packages available for all kinds of tasks. The downside of R is that I cannot treat it as a system-level programming language: it is awkward for network communication, scheduling, and so on. Coding style can differ completely from one package to the next, and the documentation is imperfect and sometimes confusing.
Python has its own ecosystem. IPython is great! It is built around a protocol that can serve many front ends. I use it through the IPython notebook or the Elpy package for Emacs. Matplotlib makes plotting easy, and its API is similar to MATLAB's. I love ggplot2, but its syntax doesn't stick with me; I occasionally have to look it up. Pandas provides a DataFrame similar to R's data frame. Statsmodels has statistical functions similar to R's, although more needs to be done before it matches the power of R. Scikit-learn implements many machine learning algorithms behind a uniform interface. Cython makes it easy to integrate C/C++ libraries with Python. I once linked a C/C++ library to R, and it took me some time to get familiar with the syntax (the Rcpp package should make this easier, but I haven't tried it).
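To show what I mean by a uniform interface, here is a minimal sketch of the fit/predict convention that scikit-learn estimators follow. It does not use scikit-learn itself: `MeanModel`, `LineModel`, and `mean_squared_error` are hypothetical names I made up for illustration, and the data are toy numbers. Real scikit-learn estimators take array-like inputs rather than plain lists.

```python
from statistics import mean


class MeanModel:
    """Hypothetical baseline: always predicts the mean of the training targets."""

    def fit(self, xs, ys):
        self.mean_ = mean(ys)
        return self  # returning self allows model.fit(...).predict(...)

    def predict(self, xs):
        return [self.mean_ for _ in xs]


class LineModel:
    """Hypothetical one-variable least-squares line."""

    def fit(self, xs, ys):
        x_bar, y_bar = mean(xs), mean(ys)
        sxx = sum((x - x_bar) ** 2 for x in xs)
        sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        self.slope_ = sxy / sxx
        self.intercept_ = y_bar - self.slope_ * x_bar
        return self

    def predict(self, xs):
        return [self.intercept_ + self.slope_ * x for x in xs]


def mean_squared_error(model, xs, ys):
    """Score any fitted model that exposes predict()."""
    preds = model.predict(xs)
    return mean((p - y) ** 2 for p, y in zip(preds, ys))


# Toy data (made up): y is exactly 2 * x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

# The same loop works for both models because they share one interface.
errors = {}
for model in (MeanModel(), LineModel()):
    errors[type(model).__name__] = mean_squared_error(model.fit(xs, ys), xs, ys)

print(errors)
```

Because every model answers to the same two methods, swapping one algorithm for another means changing a single line, which is what I find convenient compared with learning a different calling style for each R package.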
Overall, I like both of them. R has the advantage of a large number of packages, while Python feels more like a general-purpose programming language, and its analytic capabilities are growing fast.