Edgar can be used to analyse the financials of a single company over time.
However, we try to get an overall picture of the profits over all companies.

For this we use the Python Data Analysis Library Pandas and my simple Edgar REST API.

In [199]:
import pandas 
import matplotlib.pyplot as plt

Profits by Country

We can get a quick overview with the help of the follwing REST get request. We ask for countries by year and we limit the data to the first top 14 (+ 1 for all others) most important entries:

In [200]:
url = 'https://pschatzmann.ch/edgar/values?format=csv&col=country&row=year&parameter=NetIncomeLoss&n=15'
df = pandas.read_csv(url, sep=";", index_col=0)
df
Out[200]:
USA IRELAND BERMUDA UNITED KINGDOM LUXEMBOURG SINGAPORE NETHERLANDS SWITZERLAND UNKNOWN CAYMAN ISLANDS CHINA GEORGIA ARGENTINA PUERTO RICO [OTHERS]
year
2003 -3.033491e+08 NaN NaN NaN NaN NaN NaN -2.882404e+06 -7.300000e+04 NaN -8.197800e+04 -9.478040e+05 NaN NaN -5.562332e+06
2004 -4.301797e+08 NaN 9.087000e+03 NaN NaN NaN NaN -8.247950e+05 -7.752500e+04 NaN -7.552000e+03 -2.351828e+06 NaN NaN -1.312555e+07
2005 -7.105080e+08 NaN 4.727000e+03 -4.637600e+04 NaN NaN NaN -2.451203e+06 8.950560e+05 NaN -1.224500e+04 -1.611086e+06 NaN NaN -3.107430e+07
2006 -1.244381e+09 NaN 5.023000e+03 -1.780000e+04 NaN NaN NaN -3.998684e+06 -4.174400e+04 NaN -1.399570e+05 -5.841660e+05 NaN NaN -7.651370e+07
2007 2.142442e+10 NaN -9.990000e+02 -4.806500e+04 NaN NaN NaN -2.540723e+07 -2.449880e+05 -1.138400e+04 -4.919100e+05 -4.241796e+06 NaN NaN -1.237987e+08
2008 6.372951e+10 2.163260e+09 -1.657666e+09 3.243314e+09 NaN 2.800000e+07 1.100200e+08 1.527000e+08 1.777103e+08 2.211859e+08 1.700601e+08 -1.476841e+09 18811661.0 -1.243903e+09 -4.103107e+08
2009 1.248969e+11 3.601053e+09 1.129595e+10 5.469415e+09 -7.554788e+08 -2.126010e+08 2.055930e+08 5.789572e+08 -5.754954e+07 5.032624e+08 4.413066e+08 -1.212453e+09 66417586.0 -7.785250e+08 -3.271274e+08
2010 3.938061e+11 6.030840e+09 1.157711e+10 4.431566e+09 2.248439e+09 1.466548e+09 1.071588e+09 1.383534e+09 2.870082e+08 7.515504e+08 1.295407e+09 -2.189898e+09 112050038.0 -6.018380e+08 -3.182627e+09
2011 5.588024e+11 7.049756e+09 -3.643977e+08 4.240544e+09 4.312803e+09 2.454989e+09 1.139838e+09 3.163506e+08 1.449851e+09 6.835615e+08 1.219193e+09 -5.023214e+08 153592679.0 1.781683e+08 3.973782e+09
2012 4.249889e+11 1.082484e+10 1.184683e+10 5.763634e+09 6.311566e+09 2.194669e+09 1.908171e+09 9.369071e+08 1.116745e+09 8.182539e+08 4.791722e+08 4.227288e+08 202692294.0 4.314907e+08 -1.388279e+08
2013 5.694887e+11 4.538435e+09 1.472018e+10 7.143306e+09 6.077563e+09 1.426272e+09 2.244610e+09 6.568641e+08 1.091028e+09 1.446139e+09 7.669154e+08 5.312635e+09 235033263.0 1.078187e+09 -3.796084e+09
2014 5.862908e+11 1.203827e+10 1.489992e+10 -3.061614e+09 -2.344640e+08 9.424012e+08 2.215806e+09 9.703317e+08 1.280768e+09 5.690634e+08 7.469834e+07 1.034620e+09 72581000.0 2.340319e+08 4.771593e+09
2015 5.021062e+11 1.919700e+10 8.001704e+09 3.236015e+09 -1.697340e+08 6.137806e+09 -8.063580e+08 1.137712e+08 -7.866582e+08 -3.413128e+08 -6.529603e+08 9.874154e+08 105789000.0 1.049620e+09 -5.417799e+09
2016 5.704324e+11 2.288677e+10 7.920364e+09 6.632317e+09 -5.923000e+07 -4.831225e+09 -1.113112e+09 4.582820e+08 7.525393e+08 -2.014545e+08 -9.682752e+07 9.959203e+08 136366000.0 4.561614e+08 9.231391e+08
2017 1.307858e+11 6.784000e+09 8.310000e+08 4.306092e+08 NaN 3.195640e+08 -1.146450e+08 NaN -5.085480e+05 -3.215600e+04 2.439734e+07 -4.640800e+07 NaN NaN 4.639060e+08
In [201]:
df.plot.line(figsize=(20, 8))
Out[201]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f0df7b4ebe0>
 We have more then 15 years of data. For a comparison between years we should ignore the data before 2010 because we did not have enough companies filing before that year.

The US is – as expected – by far the biggest country. However it is interesting to see that the total profits of all Irish companies are bigger then then ones from Great Britain! 

The second thing to note is that companies from small countries seem to roll in bigger profits then the ones from other bigger countries: So you should have a look at companies from Bermuda, Luxemburg, Singapore, the Netherlands, Switzerland and the Cayman Islands. So this list does not match with the top 15 economies in size.

Overview by Industry

With the same approach we can also get an overview of the profits by industry sector. Here are the top 5 secotors:

In [202]:
url = 'https://pschatzmann.ch/edgar/values?format=csv&col=sicDescription&row=year&parameter=NetIncomeLoss&n=6'
sectors = pandas.read_csv(url,sep=";",index_col=0)
sectors.plot.bar( )
Out[202]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f0df7aff2e8>
In [216]:
sectors.iloc[-2].plot.pie(figsize=(5, 5))
Out[216]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f0df5b4aa58>

Over the last couple of years the Software industry was most profitable wheres the profits of the Oil industry were shrinking over time.

By States

Now we have a quick look at the most profitable states

In [217]:
url = 'https://pschatzmann.ch/edgar/values?format=csv&col=state&row=year&parameter=NetIncomeLoss&n=11'
states = pandas.read_csv(url, sep=";", index_col=0)
states.plot.line(figsize=(20, 8))
Out[217]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f0df5ba4d30>
In [214]:
states.iloc[-2].plot.bar(figsize=(5, 5))
Out[214]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f0df62a1128>

California is winning by far and Texas is the big looser. I first guess might be that Texas was not diversified enough to offset the shrinking returns in the oil industry.

Totals

And finally we look at the totals by year:

In [204]:
url = 'https://pschatzmann.ch/edgar/values?format=csv&col=&row=year&parameter=NetIncomeLoss&n=10'
total = pandas.read_csv(url,sep=";",index_col=0)
total.plot.bar(legend=False)
Out[204]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f0df77d7b00>

It looks strange that the current year is so small. So we look at the profits by month. Most filings are in December, so this is OK.

In [205]:
url = 'https://pschatzmann.ch/edgar/values?format=csv&col=&row=month&parameter=NetIncomeLoss&n=24'
total = pandas.read_csv(url,sep=";",index_col=0)
total.plot.bar(legend=False)
Out[205]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f0df7226d68>

Problems

In [120]:
url = 'https://pschatzmann.ch/edgar/values?format=csv&col=&row=unitRef&parameter=NetIncomeLoss'
units = pandas.read_csv(url,sep=";",index_col=0)
units
Out[120]:
Total
unitRef
USD 1.031272e+13
UNIT12 9.525348e+11
UNIT1 2.922792e+11
U000 2.061591e+11
U001 1.116411e+11
CAD 3.230752e+10
UNIT13 2.016197e+10
U002 9.713948e+09
UNIT_1 7.192145e+09
UNIT_2 6.877900e+09
UNIT_5 4.135174e+09
GBP 2.852600e+09
UNIT2 6.017896e+08
ISO4217-USD 4.608604e+08
U003 2.823310e+08
NRUCFC-20120531 1.150110e+08
MONETARY -2.038400e+04
USD_CAD -1.936817e+06
HKD -2.985000e+06
[OTHERS] -1.117144e+09

As we can see, we were summarizing different units of mesures.

In my next blog we show how to correct this.

 

 

Categories: Data ScienceEDGAR

Leave a Reply

Your email address will not be published. Required fields are marked *