Phil Schatzmann


Data Science

Data Science

Smart-EDGAR is supporting Formulas now…

In the last blog we demonstrated how to calculate KPIs with the help of Spark. We have now extended the Smart-EDGAR functionality so that calculated KPIs can be implemented directly with formulas. Here is a short demo in Scala which uses formulas and the built-in ‘coalesce’ method Read more…

By pschatzmann, 26. December 2018
Data Science

Processing 2.1 Mio Records from Solr in a Spark Cluster with BeakerX

I decided to build a repository of news headlines: I loaded all ‘New York Times’ headlines since the year 2000 and all business-related news from the ‘Guardian’ into the Solr search engine. It has never been the intention to process all documents in one run but the goal was Read more…

By pschatzmann, 18. December 2018
Data Science

OpenNLP: Predicting Stock Movements from the News

In my last blog I demonstrated how to build a model that can predict if a stock is going up or down based on the news headlines using Spark MLLib. In this demo I will do the same – but with the help of OpenNLP. The solution consists of the following Read more…

By pschatzmann, 17. December 2018
Data Science

MLLib: Predicting Stock Movements from the News

In this blog we will demonstrate how we can predict if a stock is going up or down based on the news headlines. The solution consists of the following components: Spark MLLib (machine learning); my News-Digest (which I have described in my last blogs); Investor (we determine the stock prices Read more…

By pschatzmann, 12. December 2018
Data Science

The Decline of the New York Times – Producing Charts with Spark

Having seen that processing large amounts of data with Spark is efficient, we will now demonstrate how to use a Spark DataFrame to generate charts with Vegas-Spark! We display the number of all New York Times articles and the Guardian business articles over time.

By pschatzmann, 12. December 2018
Data Science

Processing 2.1 Mio Records from Solr in Scala

I decided to build a repository of news headlines: I loaded all ‘New York Times’ headlines since the year 2000 and all business-related news from the ‘Guardian’ into a Solr search engine. More details can be found in my prior blog. It has never been the intention to process Read more…

By pschatzmann, 12. December 2018
Data Science

Vegas in BeakerX

Today I spent some time figuring out how to use Vegas (a plotting library for Scala) in Jupyter with the BeakerX Scala kernel. Here is the result!

By pschatzmann, 7. December 2018
Data Science

News-Digest: Accessing the History of News Headlines

Recently I spent some time investigating the options for accessing the history of news articles via an API. I was mainly interested in APIs which can be accessed free of charge. Here is the list of the most useful providers: Guardian – Easy API – Acceptable Rate Limits Read more…

By pschatzmann, 6. December 2018
Data Science

DL4J Doc2Vec – Sentiment Analysis using Sentiment140

I am planning to use the DL4J Doc2Vec implementation for a sentiment analysis. However, I don’t want to start with an empty network; the starting point should be a pre-trained network. The initial training should be done with the Sentiment140 dataset which can be found at https://www.kaggle.com/kazanova/sentiment140. It contains Read more…

By pschatzmann, 2. December 2018
Data Science

DL4J – Sentiment Analysis with SentiWordNet

The basic goal of a ‘sentiment analysis’ is to classify a given text as positive, negative or neutral. SentiWordNet is a lexical resource for opinion mining. It assigns sentiment scores to each synset of WordNet, which makes it possible to “calculate” an overall sentiment for a text. A SentiWordNet implementation can Read more…

By pschatzmann, 1. December 2018
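
To make the idea of “calculating” an overall sentiment from a lexicon a bit more concrete, here is a minimal sketch in Python. It is not the implementation from the post: the lexicon is a tiny hand-made stand-in whose scores are invented, it is keyed by plain words rather than WordNet synsets, and the names `TOY_LEXICON`, `sentiment_score`, `classify` and the `threshold` value are all assumptions made for illustration.

```python
# Toy lexicon: word -> (positive score, negative score).
# The numbers are invented for illustration; they are NOT real SentiWordNet values,
# and real SentiWordNet entries are keyed by WordNet synset, not by plain word.
TOY_LEXICON = {
    "good": (0.75, 0.0),
    "great": (0.875, 0.0),
    "bad": (0.0, 0.625),
    "terrible": (0.0, 0.875),
}


def sentiment_score(text, lexicon=TOY_LEXICON):
    """Average (positive - negative) over all words found in the lexicon."""
    # Lowercase each token and strip surrounding punctuation.
    words = [w.strip(".,!?;:\"'").lower() for w in text.split()]
    scores = [lexicon[w][0] - lexicon[w][1] for w in words if w in lexicon]
    # Texts without any known word are treated as having no sentiment.
    return sum(scores) / len(scores) if scores else 0.0


def classify(text, threshold=0.1, lexicon=TOY_LEXICON):
    """Map the averaged score to positive, negative or neutral."""
    score = sentiment_score(text, lexicon)
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for headline in ["A great and good result!", "Terrible, bad news.", "The market opened today."]:
        print(headline, "->", classify(headline))
```

Averaging the difference between the positive and negative score is only one possible way to aggregate; with the real resource one would additionally have to pick a synset (word sense) for each word before looking up its scores.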

Phil Schatzmann
Rue du Biais 24 B
1957 Ardon
Switzerland

phil.schatzmann@gmail.com
