Kai būsiu Vilniuje…

Rugpjūčio 5 d. būsiu Vilniuje ir kartu su Investuotojų asociacija planuoju pristatymą apie kiekybinius finansus (quantitative finance). Jei norite sudalyvauti, galite parašyti man arba kreiptis tiesiai į Investuotojų asociaciją.

Trumpas pristatymas.

Bandant nustatyti arba prognozuoti turto vertę dažniausiai remiamasi viena iš trijų analizės rūšių: fundamentine, technine arba kiekybine analize. Lietuvoje instituciniai investuotojai dažniausiai remiasi fundamentine analize, o privatūs investuotojai naudoja tiek fundamentinę, tiek techninę analizę.  Nepaisant to, kad trečiąja analize paremtas investavimas yra labai populiarus pasauliniu mastu, Lietuvoje tokių investuotojų tėra vienetai.

Tik neseniai investiciniams investuotojams sukurta teisinė bazė pagaliau suteikia galimybę steigti rizikos fondus ir investuoti remiantis kiekybine analize. Bet dėl žmogiškųjų resursų stokos, IT infrastruktūros, patirties stokos, didelio institucinių investuotojų susidomėjimo rizikos fondais nereiktų tikėtis.

Nors privatūs investuotojai naudoja platesnį analizės rūšių spektrą, kiekybinė analizė nepopuliari dėl kitų priežasčių. Pasaulyje, ir ypač Lietuvoje labai įsigalėjęs pseudo-mokslas – faktais ar bandymais nepagrįsti investavimo metodai. Galima suprasti privačius investuotojus – įvairūs “guru” siūlo pasakiškus pelnus, jei tik investuotojas laikysis abstrakčių taisyklių arba seks kiekvieną “guru” žingsnį (žinoma, už mokestį). Norint išbandyti tam tikrą strategiją reikia laiko, žinių ir noro tobulėti. Ar kiekvienas gali paaukoti savo savaitgalį bandydamas suprasti Black–Scholes modelį, kuris tiesiogiai investavimo rezultatų nepagerins?

Per susitikimą sieksiu parodyti, kad kiekybinės analizės pradmenys nėra sudėtingi, o įgytos žinios padės kitus investavimo būdus naudojantiems investuotojams. Manau, kad kiekvienam investuotojui būtina žinoti, kas yra turto grąža, rizika ar grąžos skirsnis (tai pamatyti bent vieną kartą yra būtina). Taip pat pateiksiu kelis statistinius faktus, kurie plačiai žinomi akademiniame ir institucinių investuotojų pasaulyje, bet retai minimi privačių investuotojų. Pristatysiu kelias technine analize ir portfelio diversifikavimu paremtas investavimo strategijas, kurių veiksmingumas patikrintas statistine analize. Pabaigoje pakalbėsiu apie strategijų kūrimą, vystymą, testavimą ir kokią programinę įrangą naudoti tokiems tikslams.

Pirmosios mano investicijos buvo pasyvios – beveik visos įmonės Lietuvoje augo ir grąža iš investicijų tik didėjo. Po kelerių metų pradėjau domėtis technine ir fundamentine analize. Techninė analizė atrodė daug žadanti – didžioji dalis mokslinės literatūros apie techninę analizę parašyta taikant tą patį šabloną – aprašyti neva veikiantys šablonai, pateikti grafiniai pavyzdžiai ir “gudrūs” samprotavimai, kodėl tai turi veikti. Turimas bakalauro diplomas informacinių technologijų srityje ir 10 metų patirtis šioje srityje padėjo automatizuoti ir patikrinti nemažą dalį technine analize paremtų strategijų. Tai darydamas priėjau prie svarbios išvados: didžioji dalis technine analize paremtų strategijų statistiškai nereikšmingos. Tai pastūmėjo daugiau laiko skirti statistikai, duomenų analizei (data mining) ir kiekybinei analizei (quantitative analysis).

Mokslinių žinių bagažą finansų srityje įgijau ir gilinau studijuodamas Liuksemburgo ir Niujorko (NYU) universitetuose, kur man buvo suteiktas Finansų ir Bankininkystės magistro laipsnis.

Comments (1)

R package Blotter

How many times have you been disappointed by nice trading system, because neither trading cost or slippage or bid/ask spread were included into back-test results? Did you find difficult to back-test a portfolio in R or many portfolios with different stocks? Blotter package is supposed to solve these problems.

In really – it is complicated. I spent a couple of days to start using it. There was one bug in the demo example, but the rest was in my code. You should remember:
1. Time zone must be specified:

Sys.setenv(TZ="GMT")

2. xts object must be created with explicit index class:

getSymbols('^GSPC',from='2000-01-01',<strong>index.class=c("POSIXt","POSIXct")</strong>)

3. When the test is conducted all blotter related values are written into .blotter environment. If you want repeat the same test, then you need to get rid of all .blotter values. So, you have to run something like this:

rm(list=ls(envir=.blotter),envir=.blotter)

You can find the author’s comment about this issue here: http://n4.nabble.com/Blotter-package-problem-with-example-tp1018634p1572911.html
4. It is slow. If you want to find the fittest rule from a bunch of them – try avoid using blotter, instead, use pure cumulative return and later on, include blotter.

Let’s test this rule with blotter as an example:
short SPY if today’s low is higher than yesterday’s close and close next day.

require(xts)
require(quantmod)
Sys.setenv(TZ="GMT")
initDate='2000-01-01'
getSymbols('SPY',from=initDate,index.class=c("POSIXt","POSIXct"))
SPY<-adjustOHLC(SPY,use.Adjusted=T)
 
#yesterday's price
tmp<-lag(SPY,1)
 
# if today's low is higher than yesterday's close 1, else 0
signal<-ifelse(Lo(SPY)>Cl(tmp),1,0)
signal[1]<-0
 
#let's plot cumulitative return to make sure, that we are on the right path
tmp<-lag(Delt(Cl(SPY)),-1)
tmp[length(tmp)]<-0
plot(cumprod(signal*tmp+1))
 
SPY<-Cl(SPY)
symbols<-c('SPY')
 
rm(list=ls(envir=.blotter),envir=.blotter)
initPortf(ltportfolio,symbols, initDate=initDate)
initAcct(ltaccount,portfolios=c(ltportfolio), initDate=initDate,initEq=initEq)
 
signal<-signal['2000-02-01::']
 
for(i in 1:length(signal))
{
	currentDate= time(signal)[i]
	equity = getEndEq(ltaccount, currentDate)
	position = getPosQty(ltportfolio, Symbol=symbols[1], Date=currentDate)	
 
	if(position==0)
	{
		#open a new position if signal is >0
		if(signal[i]>0)
		{
			closePrice<-as.double(Cl(SPY[currentDate]))
			unitSize = as.numeric(trunc((equity/closePrice)))
                        # 5$ per transaction
			addTxn(ltportfolio, Symbol=symbols[1],  TxnDate=currentDate, TxnPrice=closePrice, TxnQty = -unitSize , TxnFees=5, verbose=T)
		}
 
	}
	else
	{
		#position is open. If signal is 0 - close it.
		if(as.double(signal[i])==0 &amp;&amp; position<0)
		{
			position = getPosQty(ltportfolio, Symbol=symbols[1], Date=currentDate)
			closePrice<-as.double((Cl(SPY[currentDate])))#as.double(get(symbols[1])[i+100])
 
                        # 5$ per transaction
			addTxn(ltportfolio, Symbol=symbols[1],  TxnDate=currentDate, TxnPrice=closePrice, TxnQty = -position , TxnFees=5, verbose=T)
 
		}
 
	}
	updatePortf(ltportfolio, Dates = currentDate)
	updateAcct(ltaccount, Dates = currentDate)
	updateEndEq(ltaccount, Dates = currentDate)
 
	equity = getEndEq(ltaccount, currentDate)
 
}
plot(getAccount(ltaccount)[["TOTAL"]]$End.Eq)

The result (it will take time to get it…):
Photobucket

Comments

Predicting April month return

Bespoke blogged about average monthly returns of the DJI and emphasized April. Before jumping on that information, let’s check some weak points.
In that post, only average returns are presented. We need at least extreme points (min;max) and confidence ranges. Second problem – the normal market have upward trend and we need to get rid of that. To do so, either we have subtract the  rolling mean (a tough way) or use logarithmic prices (that’s the easy way!).

Instead of DJI, I took S&P500 from 1950 until now.

Photobucket

Sure, average return in April is above 0, but based on historical data, negative return is possible. I conducted t-test, where null hypothesis was, that average return is equal to zero. I got p-value of 0.0042, so null hypothesis can be rejected (return is above 0).

The graph below shows cumulative return of investment, investing only in April. Keep in mind, that this is log scale and real return would be higher.
Photobucket

Conlusion: based on this data, expect positive return in April.

R code

require(xts)
require(quantmod)
require(ggplot2)
getSymbols('^GSPC',from='1950-01-01')
return<-Delt(Cl(to.monthly(log(GSPC))))
return[1]<-0
temp<-as.double(format(index(return),'%m'))
 
temp<-data.frame(as.double(return),as.numeric(temp))
qplot(factor(as.numeric(temp[,2])),as.double(return),data=temp,geom = "boxplot",ylab='Returns',xlab='Months')
 
t.test(return[which(format(index(return),'%m')=='04')])
plot(cumprod(return[which(format(index(return),'%m')=='04')]+1),main='Cumulative return, 1950-present')

Comments

Parex bonds

Latvia to divide Parex Banka into two
Regardless of this news my broker sold Parex bank’s bonds.
Let’s make simple calculation. This bond has 2 outstanding coupons, worth of 5.6%. Total payoff will be approximately 1115 Eur. Somebody paid 980 Eur and is happy with 13.7% premium. Good premium, but downside risk is -100%.
Somebody is very well informed :)

Comments

Returns on Easter week and one week after

Inspired by CXO group report, I did a rerun of the same strategy on my data. Easter’s dates can be find at wikipedia. Overall, my results are similar to CXO group’s results.

In the graph below, I plotted daily returns on Easter week (Monday to Thursday) from 1982 to 2009. I prefer this way of showing things, where the range, minimum, maximum and mean of returns are presented.
It is clear, that only returns on Thursdays are above zero and t-test confirms that:
t = 2.235, df = 27, p-value = 0.03389. The rest is close to random.

Photobucket

The graph below shows daily returns one week after Easter holidays. Although Monday looks like negative day, but it lacks significance:
t = -1.1517, df = 27, p-value = 0.2595 (should be less that 0.1).
The rest is noise.

Photobucket

In summary, only returns on Thursdays have positive bias.

#R-Language code to repeat this test.
require(quantmod)
require(xts)
easter<-as.matrix(read.table(file='data/easter.csv'))
getSymbols('^GSPC',from='1980-01-01')
 
GSPC.delta<-Delt(log(Cl(GSPC)))
GSPC.delta<-Delt(Cl(GSPC))
 
#returns during the week before Easter
#returns on Thursdays
nasa<-data.frame(as.double(GSPC.delta[(as.Date(easter,'%m/%d/%y')-3)]),4)
colnames(nasa)<-c('ret','weekday')
 
#Monday to Wednesday
for(i in 1:3)
{
tmp<-data.frame(as.double(GSPC.delta[(as.Date(easter,'%m/%d/%y')-(3+i))]),(4-i))
colnames(tmp)<-c('ret','weekday')
rbind(nasa,tmp)
nasa<-rbind(nasa,tmp)
}
t.test(nasa$ret[nasa$weekday==4])
 
require(ggplot2)
qplot(factor(as.numeric(nasa$week)),as.double(nasa$ret),data=nasa,geom = "boxplot",ylab='Returns',xlab='Weekdays')
 
#returns during the week after Easter
 
GSPC.delta<-Delt(Cl(GSPC))
nasa<-data.frame(as.double(GSPC.delta[(as.Date(easter,'%m/%d/%y')+1)]),1)
colnames(nasa)<-c('ret','weekday')
for(i in 2:5)
{
print(i)
tmp<-data.frame(as.double(GSPC.delta[(as.Date(easter,'%m/%d/%y')+i)]),i)
colnames(tmp)<-c('ret','weekday')
rbind(nasa,tmp)
nasa<-rbind(nasa,tmp)
}
t.test(nasa$ret[nasa$weekday==1]);t.test(nasa$ret[nasa$weekday==2])
qplot(factor(as.numeric(nasa$week)),as.double(nasa$ret),data=nasa,geom = "boxplot",ylab='Returns',xlab='Weekdays')

Comments

Strategy: what if SPY & VIX are up?

Recently, I was busy testing the following strategy:

If SPY and VIX daily returns are positive, then short SPY at close and keep it for one day.

The strategy is dump simple and it has very good feature – short side. There are not so many successful short side strategies. For testing purpose I used daily Yahoo data from 1995 until present. For commissions and bid/ask spread I used 5 $ fee per 10 000 $ trade. Here we go:

Photobucket

Annualized Return                     0.0421
Annualized Std Dev                   0.0488
Annualized Sharpe (Rf=0%)        0.8621
t = 3.3787, df = 3811, p-value = 0.0007356

Up to this point is was relatively easy to make a test (the true is, that I spent some time cracking and hacking blotter package, but I will write about it in separate post). My second objective was the improvement of this strategy.

One of the way to understand the strategy is to look how the components are related to each other or correlated. To do that, I took daily returns of SPY and VIX at day 0 and plotted against SPY next day’s (day+1) returns.

Photobucket

What can I tell by looking at this plot? I couldn’t figure out any linear relation between returns of SPY and VIX at day+0 and returns of SPY at day+1. Should I try something like random forest method?

I tried to add some TA flavors, like RSI, but the improvements were not very significant. Simplicity is genius!

Comments

End of the month investment

It is know, that the first day of the month provides bullish edge. According to Quantifiable edges not all the months are equal. So, I made a test on S&P500 index, from January, 1980 until February, 2010. It is true, March isn’t the best month to run this strategy.

Photobucket

Only 3 months have significant results based on p-values:
“month 5, p-value 0.0399233570186162″
“month 7, p-value 0.0466800163648646″
“month 11, p-value 0.0218919220125013″

p.s. if somebody is interested in R-Language code to repeat this test, then let me know.

Comments

Nedarbas Lietuvoje

Šiuo įrašu noriu pademonstruoti R-Language galimybes – kaip panaudojant įvairias duomenų bazes galima sugeneruoti gražius grafikus, kurie turėtų padėti analitikams.
Photobucket

R-Language kodas:

#Lietuvos žemėlapį parsisiunčiam iš čia: http://gadm.org/data/rda/LTU_adm1.RData
#statistinius duomenis iš čia: http://db1.stat.gov.lt/statbank/default.asp?w=1274</code>
 
load(file='LTU_adm1.RData')
tmp&lt;-read.table(file='nedarbas.apskr.csv',sep=';')
nasa.factor&lt;-as.factor(as.numeric(cut(tmp$V2, c(0,9,10,11,12))))
 
levels(nasa.factor)&lt;- c("&lt;9%", "9-10%", "10-11%","11-12")
gadm$col_no&lt;-nasa.factor
 
myPalette&lt;-brewer.pal(4,"Purples")
jpeg(file='nedarbas.Lietuvoje.jpg',width=500)
spplot(gadm, "col_no", col=grey(.9), col.regions=myPalette,main='Nedarbas Lietuvoje')
dev.off()

Šaltinis: http://ryouready.wordpress.com/2009/11/16/infomaps-using-r-visualizing-german-unemployment-rates-by-color-on-a-map/

Comments (1)

Gas price seasonality

Last spring I read “Quantitative Trading” by Ernest P. Chan. In his book, he suggested to buy gas futures contract at the end of February and sell it later, in March. Today, I decided to test this strategy by using R-language.
The most important thing for such investigation is data. For this purpose, I used this public data: www.eia.doe.gov
I made 2 test – for the first one, I used futures contract, which will be settled after 4 months and for the second one, I used gas spot price.
In the first plot, we can see monthly price returns scaled by months (1-12). It is clear, that mean of March’s returns are above 0. What is encouraging in this plot is that the ranges of returns are above 0 as well (meaning, that majority of March’s returns was above 0).
Anova test gives p-value below 0.01 – some months in the group has different mean (that supports seasonality idea). March’s t-test gives these values: t = 2.8064, df = 15, p-value = 0.01329.

Photobucket

I used gas spot prices to generate second plot. It is similar to the first one, except that March’s returns has longer tail. It is worth to note, that September stands as a positive month as well. The statistics for this test are not so strong, as it was in the first example.
P-value of Anova test is only 0.11 and March’s t-test is:
t = 1.3791, df = 15, p-value = 0.1881

Photobucket

R-language code to run these tests:

#------gas contract 4 -----
library(xts)
library(ggplot2)
library(quantmod)
 
tmp<-as.matrix(read.table(file='gas.contract.4.csv',sep=';'))
gas<-as.xts(as.double(tmp[,2]),order.by=as.POSIXct(tmp[,1]))
plot(gas)
gas.monthly<-to.monthly(gas)
gas.monthly.delta<-Delt(Cl(gas.monthly))
gas.monthly.delta[1]<-0
 
gas.factor<-as.double(format(index(gas.monthly.delta),'%m'))
tmp<-data.frame(as.double(gas.monthly.delta),as.numeric(gas.factor))
qplot(factor(as.numeric(gas.factor)),as.double(gas.monthly.delta),data=tmp,geom = "boxplot",ylab='Returns',xlab='Months')
 
anova(lm(as.double(gas.monthly.delta)~gas.factor))
t.test(((gas.monthly.delta[which(format(index(gas.monthly.delta),'%m')=='03')]))[,1])
 
#----gas contract spot -------
tmp<-as.matrix(read.table(file='gas.csv',sep=';'))
gas<-as.xts(as.double(tmp[,2]),order.by=as.POSIXct(tmp[,1]))
 
plot(gas)
colnames(gas)<-c('Close')
 
gas.monthly<-to.monthly(gas)
gas.monthly.delta<-Delt(Cl(gas.monthly))
gas.monthly.delta[1]<-0
 
gas.factor<-as.double(format(index(gas.monthly.delta),'%m'))
tmp<-data.frame(as.double(gas.monthly.delta),as.numeric(gas.factor))
 
qplot(factor(as.numeric(gas.factor)),as.double(gas.monthly.delta),data=tmp,geom = "boxplot",ylab='Returns',xlab='Months')
 
anova(lm(as.double(gas.monthly.delta)~gas.factor))
t.test(((gas.monthly.delta[which(format(index(gas.monthly.delta),'%m')=='03')]))[,1])

Comments (4)

Sugar price seasonality

Recently, Orion securities have issued a “BUY” recomendation for Cugar ETF. Because, neither I follow the recommendations nor I’m big fan of TA (I have to admit, that I was…), I decided to check sugar price seasonality.

Voila, the mean of monthly returns are presented in the graph. February, April and May tend to be negative and June and July show positive returns.
BUT! Don’t forget to ask – are these results significant? P-value for July is 14%, 34% for April. The rest is above 50%. So, keep in mind, that these results are very weak…

Photobucket

More descriptive plot:

Photobucket

Comments

« Previous entries Next Page » Next Page »