Interesting video: [Video] A Huge Debate: R vs. Python for Data Science (presentation by Eduardo Ariño de la Rubia)
Packages discussed:
Python:
- Collecting Data
- Feather (binary file format for storing data frames fast; interoperable with R and other languages, https://pypi.python.org/pypi/feather-format)
- Ibis (Python bridge to data sources including SQL databases and Spark; http://docs.ibis-project.org/)
- ParaText (reading text files very fast, https://github.com/wiseio/paratext)
- bcolz (columnar storage, https://github.com/Blosc/bcolz)
- Data Visualization
- Altair (static plots, https://github.com/altair-viz/altair, gallery: https://altair-viz.github.io/gallery/index.html)
- Bokeh (dynamic plots, http://bokeh.pydata.org/en/latest/, gallery: http://bokeh.pydata.org/en/latest/docs/gallery.html)
- geoplotlib (maps, https://github.com/andrea-cuttone/geoplotlib)
- Clean & Transform
- Blaze (efficiently dealing with large data, http://blaze.readthedocs.io/en/latest/index.html)
- Xarray (n-dimensional data, http://xarray.pydata.org/en/stable/)
- Dask (parallel computing, http://dask.pydata.org/en/latest/)
- Modeling
- Keras (deep learning, https://github.com/fchollet/keras)
- PyMC3 (Bayesian statistical modeling, https://pymc-devs.github.io/pymc3/)
R:
- Collecting Data
- Feather (see also under Python; https://cran.r-project.org/web/packages/feather/index.html)
- haven (interoperability with Stata, SPSS, SAS, https://cran.r-project.org/web/packages/haven/index.html)
- readr (read csv and other text files fast, https://cran.r-project.org/web/packages/readr/index.html)
- jsonlite (read/write JSON files, https://cran.r-project.org/web/packages/jsonlite/index.html)
- Visualization
- ggplot2 (everyone knows this already; http://ggplot2.org/, gallery: http://www.r-graph-gallery.com/all-graphs/)
- htmlwidgets (dynamic plots; http://www.htmlwidgets.org/)
- Leaflet (maps; https://rstudio.github.io/leaflet/)
- tilegramsR (tilegrams in R, http://blog.revolutionanalytics.com/2016/10/tilegrams-in-r.html)
- Clean & Transform
- dplyr (https://cran.rstudio.com/web/packages/dplyr/vignettes/introduction.html)
- broom (makes results of fitting commands easier to access, ftp://cran.r-project.org/pub/R/web/packages/broom/vignettes/broom.html)
- tidytext (text mining, http://tidytextmining.com/)
- Modeling
- MXNet (deep learning; https://github.com/dmlc/mxnet)
- TensorFlow (https://rstudio.github.io/tensorflow/)