Data Science is a change agent for businesses
Data science is an interdisciplinary field that combines machine learning, statistics, advanced analytics, and programming. It is a new art form that surfaces hidden insights and puts data to work in the era of cognitive computing.
IBM Data Science Experience (DSX) is an enterprise platform for data scientists and data engineers. It offers out-of-the-box open-source and commercial data science tools including RStudio, Apache Spark, Jupyter, and Zeppelin notebooks. DSX supports the entire data science lifecycle from data preparation and ETL to model development and deployment. With DSX, companies can build predictive and machine learning models using their favorite tools, technologies, and libraries, while leveraging the scale, security and governance of the HDP platform.
Data science lifecycle
Access to community
DSX provides a social environment where data scientists can research and share articles, data sets, notebooks, and tutorials. DSX helps data scientists and analysts come up to speed by taking courses in R, Python, or Scala, copying content into a Jupyter or Zeppelin notebook, or working in an embedded RStudio environment.
With DSX, data scientists can create new Jupyter or Zeppelin notebooks in R, Python, or Scala, or import an existing notebook. DSX includes popular open source libraries such as PySpark, matplotlib, and SparkML, along with machine learning and deep learning APIs. Data scientists can use DSX to tell a compelling story with open source visualization libraries like Brunel and PixieDust, and they can install other open source libraries of their choice.
Code in Scala, Python, R, Apache Spark and SQL
Visualize and share code using Zeppelin & Jupyter Notebooks
Leverage RStudio IDE and Shiny
Use your favorite libraries including Scikit-learn, XGBoost, Spark MLlib, TensorFlow, Caffe, Keras and MXNet
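As a rough illustration of the kind of notebook cell this workflow implies, the sketch below walks through data preparation, model development, and prediction in plain Python. It is not DSX code: the function names (`prepare`, `fit_line`, `predict`) and the sample data are hypothetical, and in practice a library such as Scikit-learn or Spark MLlib would likely replace the hand-written fit.

```python
# Hypothetical notebook cell: all names and data here are illustrative only.

def prepare(rows):
    """Data preparation: drop records with missing values and cast to float."""
    return [(float(x), float(y)) for x, y in rows
            if x is not None and y is not None]

def fit_line(points):
    """Model development: ordinary least squares for y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict(model, x):
    """Scoring: apply the fitted (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * x + intercept

# Made-up sample data with one incomplete record.
raw = [(1, 3), (2, 5), (None, 7), (3, 7), (4, 9)]
clean = prepare(raw)
model = fit_line(clean)
print(predict(model, 5))
```

The three functions are kept separate so each step can be swapped for a library call without touching the others.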
The combination of HDP and DSX empowers enterprises to run data science at scale by leveraging all the data in the data lake, as well as deploying enterprise-grade security, governance, and operations.
Data Science at Scale - Run Spark Jobs on HDP Cluster