Archive for January, 2011

Tick data retrieval

I just published Java-based code to pull tick data from Interactive Brokers. There are thousands of tools for getting tick data from IB, but I had one particular feature in mind.

You can get a maximum of 50 quotes per second from Interactive Brokers (it is an IB limitation of the TWS API). Imagine a situation where there is a delay in storing incoming data, because the I/O process is very slow or the system is briefly overloaded. In such a case either some data will be lost or the system will crash. OK, let’s say you have plenty of RAM and a speedy hard drive. Does it make sense for a real-time trading system to write tick data to disk first and only then pass the information further? Can it be done asynchronously? Yes, and Java Message Service was built for that.

So my idea was to build a tool that grabs tick data from the provider and passes it to JMS. The retrieval tool doesn’t care what happens next: a disk crash, heavy processing of the data or saving to storage. On the other end, JMS can have one or many clients, and it passes all incoming information to them. If something happens to a client during the transfer, JMS takes care of it: it preserves incoming information and waits for the fallen client to come back.

If you are looking for an example of how to wire together JMS, ActiveMQ, Spring, Hibernate, JPA and Maven, then this code can help you as well.

Interesting volatility measurement, part 2

A few weeks ago I mentioned an interesting volatility prediction. It is based on two periods of historical volatility (standard deviation). The remaining question was: does it really work? I could not give an answer, because I didn’t have VIX futures data at that time. Later on I was contacted by Brian G. Peterson, who provided the data necessary to finish this test. By the way, I just found out that CBOE shares VIX futures data on its website.

Now I want to show you the returns of VIX futures over the next 3 days when the ratio of 3-day to 10-day historical volatility is less than 0.25:

[Figure: VIX front contract, cumulative profit of the strategy]

Sys.setenv(TZ="GMT")
require('xts')
require('quantmod')
require('blotter')
require('PerformanceAnalytics')
 
tmp<-as.matrix(read.table('tickers/various_day_close/VIXc1.csv',sep=',',header=TRUE))
vix<-as.xts(as.double(tmp[,9]),order.by=as.POSIXct(strptime(tmp[,2],'%d-%b-%Y'),tz='GMT'))
vix<-(vix[!is.na(vix)])
colnames(vix)<-c('Close')
 
 
tmp<-as.matrix(read.table('tickers/various_day_close/ESc1.csv',sep=',',header=TRUE))
es<-as.xts(as.double(tmp[,9]),order.by=as.POSIXct(strptime(tmp[,2],'%d-%b-%Y')))
es<-(es[!is.na(es)])
colnames(es)<-c('Close')
 
#-----------------data end-----------------
 
 
#-----------------signal-------------------
es.delta<-Delt(Cl(es))
delta<-Delt(Cl(vix))#Front contract
 
#Historical volatility during 3 and 10 days
short.vol<-as.xts(rollapply(es.delta,3,sd,align='right'))
long.vol<-as.xts(rollapply(es.delta,10,sd,align='right'))
 
past.vol<-short.vol/long.vol
future.vol<-lag(past.vol,-3)
future.delta<-lag(vix,-3)/vix-1
 
signal<-ifelse(past.vol<0.25,1,0)
 
#here we see, increase in historical volatility
summary(as.double(future.vol[index(signal[signal!=0])]))/summary(as.double(past.vol[index(signal[signal!=0])]))
 
#-----------------signal end-------------------
 
#--------------blotter code------------------
symbols<-c('vix')
 
initDate=time(get(symbols)[1])
initEq=50000
rm(list=ls(envir=.blotter),envir=.blotter)
ltportfolio='volatility'
ltaccount='volatility'
initPortf(ltportfolio,symbols, initDate=initDate)
initAcct(ltaccount,portfolios=c(ltportfolio), initDate=initDate,initEq=initEq)
currency("USD")
stock(symbols[1],currency="USD")
 
signal<-signal[index(vix)]
 
signal[is.na(signal)]<-0
 
counter<-0 #date counter - exit on 3rd day
 
for(i in 2:length(signal))
{
	currentDate= time(signal)[i]
	equity = initEq #getEndEq(ltaccount, currentDate)
	position = getPosQty(ltportfolio, Symbol=symbols[1], Date=currentDate)	
	print(position)
	print(currentDate)
	if(position==0 &counter==0)
	{		
		#open a new position if signal is >0
		if(signal[i]>0)
		{
			print('open position')
			closePrice<-as.double(get(symbols[1])[currentDate])
			print(closePrice)
			unitSize = as.numeric(trunc((equity/closePrice)))
			print(unitSize)
			commssions=-unitSize*closePrice*0.0003
			addTxn(ltportfolio, Symbol=symbols[1],  TxnDate=currentDate, TxnPrice=closePrice, TxnQty = unitSize , TxnFees=commssions, verbose=T)
			counter<-1
		}
 
	}
	else
	{
		#position is open. If signal is 0 - close it.
		if(position>0 & as.integer(signal[i])==0 &counter>=3)
		{
			position = getPosQty(ltportfolio, Symbol=symbols[1], Date=currentDate)
			closePrice<-as.double(get(symbols[1])[currentDate])#as.double(get(symbols[1])[i+100])
			commssions=-position*closePrice*0.0003
			addTxn(ltportfolio, Symbol=symbols[1],  TxnDate=currentDate, TxnPrice=closePrice, TxnQty = -position , TxnFees=commssions, verbose=T)
			counter<-0
		}
		else
			counter<-counter+1
 
	}	
	print('>>>>>>>>>>>>')
	updatePortf(ltportfolio, Dates = currentDate)
	updateAcct(ltaccount, Dates = currentDate)
	updateEndEq(ltaccount, Dates = currentDate)
}
rez1<-(getPortfolio(ltaccount))
 
#--------------blotter code end------------------
 
#----------------results------------------------
png('vix_front.png',width=650)
#net profit - commissions, slippage excluded
chart.TimeSeries(cumsum(rez1$symbols$vix$txn[,7]),main='VIX front contract')
dev.off()
#----------------results end------------------------

The graph shows that this strategy is purely random or just follows the VIX index. Now let’s see what the returns of this strategy are if S&P 500 futures are used instead of VIX.

[Figure: ES future contract, cumulative profit of the strategy]

signal<-ifelse(past.vol<0.25,1,0)
#signal<-signal[index(es)]
 
 
 
#------------------------blotter code-----------------------
symbols<-c('es')
 
initDate=time(get(symbols)[1])
initEq=15000
rm(list=ls(envir=.blotter),envir=.blotter)
ltportfolio='volatility'
ltaccount='volatility'
initPortf(ltportfolio,symbols, initDate=initDate)
initAcct(ltaccount,portfolios=c(ltportfolio), initDate=initDate,initEq=initEq)
currency("USD")
future(symbols[1],currency="USD",multiplier=50,1/4)
 
signal[is.na(signal)]<-0
 
counter<-0
 
for(i in 2:length(signal))
{
	currentDate= time(signal)[i]
	equity = initEq #getEndEq(ltaccount, currentDate)
	position = getPosQty(ltportfolio, Symbol=symbols[1], Date=currentDate)	
	print(position)
	print(currentDate)
	if(position==0 &counter==0)
	{		
		#open a new position if signal is >0
		if(signal[i]>0)
		{
			print('open position')
			closePrice<-as.double(get(symbols[1])[currentDate])
			print(closePrice)
			unitSize = 1#as.numeric(trunc((equity/closePrice)))
			print(unitSize)
			commssions=-2
			addTxn(ltportfolio, Symbol=symbols[1],  TxnDate=currentDate, TxnPrice=closePrice, TxnQty = unitSize , TxnFees=commssions, verbose=T)
			counter<-1
		}
 
	}
	else
	{
		#position is open. If signal is 0 - close it.
		if(position>0 & as.integer(signal[i])==0 &counter>=3)
		{
			position = getPosQty(ltportfolio, Symbol=symbols[1], Date=currentDate)
			closePrice<-as.double(get(symbols[1])[currentDate])#as.double(get(symbols[1])[i+100])
			commssions=-2
			addTxn(ltportfolio, Symbol=symbols[1],  TxnDate=currentDate, TxnPrice=closePrice, TxnQty = -position , TxnFees=commssions, verbose=T)
			counter<-0
		}
		else
			counter<-counter+1
 
	}	
 
	updatePortf(ltportfolio, Dates = currentDate)
	updateAcct(ltaccount, Dates = currentDate)
	updateEndEq(ltaccount, Dates = currentDate)
}
rez1<-(getPortfolio(ltaccount))
#-------------------------results---------------------
#net profit
png('vix.png',width=650)
chart.TimeSeries(cumsum(rez1$symbols$es$txn[,9]),main='ES future contract')
dev.off()

Well, that is the exact opposite of what I expected: if volatility is expected to increase, as described in the first post, then the returns of the S&P index should be negative in the long run.

From the beginning I suspected that this has more to do with the standard deviation formula than with forecasting.
Now the funny part: I generated 2500 random returns and got a median of 0.9930 and a mean of 1.6360 for all days. Then I took only the days when a buy signal would have been generated, and guess what I got? The median was 4.3170 and the mean 6.3450. Once again a significant difference, but on random data.
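
The random-data check can be sketched roughly as follows. This is only an illustration in base R, not the code behind the numbers above: the post does not say which statistic was summarised, so the sketch assumes it is the volatility ratio three days ahead divided by today's ratio, and the seed and the helper `roll_sd` are made up for the example.

```r
# Sketch: the volatility-ratio signal applied to pure noise (base R only).
set.seed(42)                      # arbitrary seed, for repeatability only
returns <- rnorm(2500)            # 2500 random "returns" with no structure

# Right-aligned rolling standard deviation; the first n-1 values are NA.
roll_sd <- function(x, n) c(rep(NA, n - 1), apply(embed(x, n), 1, sd))

past_ratio   <- roll_sd(returns, 3) / roll_sd(returns, 10)
# The same ratio three days later, aligned to today's row.
future_ratio <- c(past_ratio[-(1:3)], rep(NA, 3))

# Assumed statistic: how much the ratio changes over the next three days.
change <- future_ratio / past_ratio
signal <- !is.na(change) & past_ratio < 0.25

summary(change[!is.na(change)])   # all days
summary(change[signal])           # only days when the signal fires
```

Selecting days on which the ratio is unusually low almost guarantees that the ratio will be higher a few days later, even when the returns carry no information at all, which is the point made above.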

Source code on github

Seasonal pair trading

quanttrader.info is a good quantitative repository, where I found an idea for a seasonal spread play.

Seasonal pair trading differs from ordinary pairs trading in that it doesn’t look for deviations from the spread’s mean; it looks at seasonal spread patterns instead. In some cases it is easier to explain why a seasonal spread works at all. For example, during the winter the consumption of heating oil goes up, while the opposite is true for gasoline. During the summer it is just the opposite: because of holidays the demand for gasoline shoots up.

The data

Be aware that you may obtain different results, because a lot depends on the quality of the data and on understanding it. In the real world a continuous contract doesn’t exist. Yes, there are some shops/brokers which provide such a contract, but the NYMEX exchange lists futures contracts with a fixed duration. In the latter case you have to derive your own continuous contract and roll it each month if it is a front month contract.
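
As a rough illustration of deriving a front month series, here is a minimal base R sketch. Everything in it is hypothetical: the contract codes, prices and expiry dates are invented, and rolling on the expiry day is just one possible rule, not necessarily the one used for the data in this post.

```r
# Sketch: stitch a front-month return series from individual contracts.
# All prices, dates and contract codes below are made up for illustration.
quotes <- data.frame(
  date     = as.Date(c("2010-01-18", "2010-01-19", "2010-01-20",
                       "2010-01-18", "2010-01-19", "2010-01-20", "2010-01-21")),
  contract = c("G10", "G10", "G10", "H10", "H10", "H10", "H10"),
  close    = c(78.0, 79.0, 77.6, 78.4, 79.3, 78.0, 76.4),
  stringsAsFactors = FALSE
)
# Last trading day of each contract (hypothetical values).
expiries <- data.frame(
  contract = c("G10", "H10"),
  expiry   = as.Date(c("2010-01-20", "2010-02-22")),
  stringsAsFactors = FALSE
)

q <- merge(quotes, expiries, by = "contract")
q <- q[q$date <= q$expiry, ]          # ignore quotes after a contract expired

# Returns are computed inside each contract, so the price gap between the
# old and the new contract on the roll day never shows up as a return.
q <- q[order(q$contract, q$date), ]
q$ret <- ave(q$close, q$contract,
             FUN = function(p) c(NA, diff(p) / head(p, -1)))

# For every date keep the contract with the nearest expiry: the front month.
q <- q[order(q$date, q$expiry), ]
front <- q[!duplicated(q$date), c("date", "contract", "close", "ret")]
front
```

The design choice worth noting is that returns are taken before the contracts are stitched; differencing the stitched price column instead would turn every roll into a spurious jump.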

To run this test, I took the data from here. I had to rely on freely available data in this case, because I don’t have access to commercial data for such a long period. Let me know if you get substantially different results with data from other providers. There are some differences between my results and the results shared by quanttrader.info.

The test

First of all, let’s plot cumulative returns of the oil (CL) and the gasoline (RB) front month contracts:

[Figure: Oil & Gasoline prices, cumulative returns of CL and RB]

The next graph shows the cumulative spread between CL and RB in percentage terms. It is difficult to spot any seasonal pattern just by looking at it, except that during some years it was trending down. This can be a problem for a long-term investment (let’s say more than 3 months, which is just an educated guess).

[Figure: Seasonal spread %: CL vs RB]

Let’s look at the daily averages aggregated by month, in percentage terms:

01 -0.12%
02 0.01%
03 -0.44%
04 -0.08%
05 0.02%
06 0.28%
07 -0.04%
08 -0.03%
09 0.348%
10 0.12%
11 -0.009%
12 -0.07%

As we can see, there are 3 months (March, June and September) whose average returns deviate from the overall daily mean of -0.0029%. Because averages can be misleading, it is worth checking the spread of the returns behind these averages. This time, instead of daily means, I used monthly returns to generate the following graph:

[Figure: box plot of monthly spread returns by month]

The graph above shows that some months had returns around zero, or had returns that were distributed very widely, like August. However, during March, June and September the returns were very consistent. Let’s take a look at March’s cumulative return:

[Figure: March cumulative return]

Here is the problem: during the last few years the curve flattened and March’s returns are close to zero. It basically means that you have to avoid investing in the spread during this month.

Now let’s check the cumulative returns of June (black) and September (red):

[Figure: cumulative returns of June (black) and September (red)]

This time the returns are much more consistent and can be used for further development.
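
Before building on a monthly average, one way to check whether it is distinguishable from zero is a confidence interval per month. The sketch below uses simulated noise rather than the CL/RB data, and the one-sample t-test is my own choice of tool for the illustration, not something taken from the original study.

```r
# Sketch: monthly averages of a daily spread return with 95% confidence
# intervals. The data are simulated noise, not the CL/RB series.
set.seed(1)
dates  <- seq(as.Date("1997-01-01"), as.Date("2010-11-30"), by = "day")
dates  <- dates[!format(dates, "%u") %in% c("6", "7")]   # weekdays only
spread <- rnorm(length(dates), mean = 0, sd = 0.02)      # no seasonality
month  <- format(dates, "%m")

ci <- t(sapply(split(spread, month), function(r) {
  tt <- t.test(r)                      # one-sample t-test against zero
  c(mean  = unname(tt$estimate),
    lower = tt$conf.int[1],
    upper = tt$conf.int[2])
}))

round(100 * ci, 3)                     # in percent, one row per month
# A month is only interesting if its interval excludes zero.
rownames(ci)[ci[, "lower"] > 0 | ci[, "upper"] < 0]
```

With twelve months tested at the 95% level, an occasional month can pass by chance even on noise, so a result like this is a filter rather than proof of seasonality.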

The final word

The results of my study do not support the results obtained by Paul Teetor. Most likely the differences come from the data. I used free data and I can’t be sure that this data repository can be trusted. In this study I used front month contracts, which expire in the same month. If you try the same study with the following month’s contracts, the results will be different as well.

Paul Teetor mentioned in his study that he prefers to deal with dollar returns, whereas my study is based on price returns. I tried to obtain the hedge value of 1.13 disclosed in his study, but in the end I did it my own way, as presented below. The hedge value is important, because you have to know how much to invest in each asset. The reason is that each asset can have a different volatility, so you need one amount of money for the short leg and another for the long leg. Below is a graph where you can see the yearly difference between the volatilities of CL and RB:

[Figure: Yearly difference of vol. between CL & RB]

When the value is above zero, gasoline is more volatile than oil (the code plots the volatility of RB minus that of CL), so you have to underweight gasoline and overweight oil. By the way, this graph doesn’t provide the hedge ratio; it is just a proof of concept.
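
To make the idea of volatility-based leg sizing concrete, here is a small sketch using inverse-volatility weights. This is a generic technique that I am assuming for illustration; it is not the method behind the 1.13 hedge value, and the returns and volatilities are simulated rather than estimated from CL or RB.

```r
# Sketch: size the two legs of a spread so that each carries similar risk.
# Returns are simulated; the volatilities are made up for illustration.
set.seed(7)
leg_a <- rnorm(250, sd = 0.020)   # daily returns of the first leg
leg_b <- rnorm(250, sd = 0.025)   # second leg, simulated with a higher volatility

vol_a <- sd(leg_a)
vol_b <- sd(leg_b)

# Inverse-volatility weights: the more volatile leg gets less capital.
w_a <- (1 / vol_a) / (1 / vol_a + 1 / vol_b)
w_b <- 1 - w_a

capital <- 100000                 # hypothetical amount split across both legs
c(leg_a = capital * w_a, leg_b = capital * w_b, vol_ratio = vol_b / vol_a)
```

With these weights the product of weight and volatility is the same for both legs, so neither side dominates the risk of the spread. A regression-based hedge ratio would also account for correlation, which this sketch ignores.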

The source file can be found on github, and the code is listed below.

require('xts')
require('quantmod')
Sys.setenv(TZ="GMT")
 
require('PerformanceAnalytics')
 
tmp<-as.matrix(read.table('tickers/various_day_close/rb_contract1.csv',sep=',',header=FALSE))
rb<-as.xts(as.double(tmp[,2]),order.by=as.POSIXct(strptime(tmp[,1],'%Y-%m-%d')))
 
rb<-tail(rb,-3)['::2010-11']
 
 
tmp<-as.matrix(read.table('tickers/various_day_close/cl_contract1.csv',sep=',',header=FALSE))
cl<-as.xts(as.double(tmp[,2]),order.by=as.POSIXct(strptime(tmp[,1],'%Y-%m-%d')))
cl<-tail(cl,-3)['::2010-11']
 
 
rb.delta<-Delt(((rb)))['1997-01::']
cl.delta<-Delt(((cl)))['1997-01::']
 
rb.delta[is.na(rb.delta)]<-0
cl.delta[is.na(cl.delta)]<-0
 
spread<-cl.delta*50000-rb.delta*50000
 
png('spread_cl_prices.png',width=650)
chart.CumReturns(cbind(cl.delta,rb.delta),col=c(2,3),main='Oil & Gasoline prices')
dev.off()
 
png('spread_cl_rb.png',width=650)
chart.TimeSeries(cumsum(spread),main='Seasonal spread: CL vs RB')
dev.off()
 
spread<-cl.delta-rb.delta
 
png('spread_cl_rb_prc.png',width=650)
chart.CumReturns((spread),main='Seasonal spread %: CL vs RB')
dev.off()
 
spread.factor<-as.factor(format(index(spread),'%m'))
aggregate(spread, spread.factor,mean)
summary(lm(as.double(spread)~(spread.factor)))
 
 
 
rb.delta.monthly<-Delt(Cl(to.monthly(rb)))['1997-01::']
cl.delta.monthly<-Delt(Cl(to.monthly(cl)))['1997-01::']
 
rb.delta.monthly[is.na(rb.delta.monthly)]<-0
cl.delta.monthly[is.na(cl.delta.monthly)]<-0
 
factor<-as.factor(format(index(cl.delta.monthly-rb.delta.monthly),'%m'))
tmp<-data.frame(as.double(cl.delta.monthly-rb.delta.monthly),as.numeric(factor))
 
require('ggplot2')
png('monthly_averages.png',width=650)
#print() is needed for the plot to be drawn when the script is run via source()
print(qplot(factor(as.numeric(factor)),as.double(cl.delta.monthly-rb.delta.monthly),data=tmp,geom = "boxplot",ylab='Monthly average returns',xlab='Months'))
dev.off()
 
png('march_cumulative.png',width=650)
chart.CumReturns(spread[spread.factor=='03'],main='March cumulative return')
dev.off()
 
png('yearly_diff.png',width=650)
chart.TimeSeries(cbind(as.xts(rollapply(rb.delta,250,sd,align='right'))-as.xts(rollapply(cl.delta,250,sd,align='right'))),main='Yearly difference of vol. between CL & RB')
dev.off()
