Archive for quant

timezone issue in R

While investigating Intraday patterns in FX returns and order flow paper I have faced the problem with timezone. I had 3 data sources with different timezones (GMT, CET, CEST). Most confusing thing was, that I didn’t know, how to deal with summer time.
But why did I have the data with summer time in the first place?  Well, I use IBrokers package to get latest Forex data and turns out, that it is impossible to specify timezone parameter within reqHistoricalData function. Once the data is received, it assigns default timezone of R (in my case GMT), but the requested data comes with OS timezone (CET/CEST in my case). It took for while to realize that, but the real challenge was how to convert CET/CEST to GMT.

The answer – don’t use CET/CEST, but instead of that use something like ‘Europe/Paris’ or ‘Europe/Berlin’. This approach takes into account summer time issue and you don’t need to worry about it. Once you have the data in ‘Europe/Paris’ or ‘Europe/Berlin’ format you can easily convert it:

?View Code RSPLUS
Sys.setenv(TZ="Europe/Paris")
eur.usd=reqHistoricalData(tws,currency,whatToShow='MIDPOINT',barSize='1 hour')
Sys.setenv(TZ="GMT")
eur.usd=xts(Op(eur.usd),tz='GMT')

or you can change the index:

?View Code RSPLUS
 format(eur.usd, tz="GMT",usetz=TRUE)

Comments

Transaction cost analysis and pre-trade analysis

Transaction cost analysis (TCA) is the framework to achieve best execution in trading context. TCA can be split into three groups: pre-trade analysis, intraday analysis, and post-trade measurement.

Pre-trade analysis allows us to get insight about the future volatility of the price, forecast intra-day and daily volumes, market impact. It evaluates all strategies and advises the strategy that is most consistent with manager preferences for the risk.

Intraday analysis is real time analysis, where the system ensure that the strategy performs in line with the forecast.

Post-trade analysis measures the implementation of the investment decision to ensure, that pre-trade models are accurate and best execution is delivered.

If you want to learn more about the transaction cost analysis, I highly recommend Optimal Trading Strategies: Quantitative Approaches for Managing Market Impact and Trading Risk by Robert Kissell. The book not only covers all type of mentioned analysis, but also considers the practical aspects of the implementation of trading strategy and algorithms. The book includes very interesting part about Principal Bid Transactions or blind bids.

To build a forecast model we need some input parameters. Pre-trade analysis can be split into two parts – fundamental part and statistically based part. It is much ease to derive fundamental facts such as market capitalization, company profile, the sectors in which company operates, than statistically based parameters. For the latter I’m going to use R and the raw stock price data to derive it. The former can easily be found on Internet, Bloomber or Reuter.

The first input parameter that I’m are going to derive is average daily trading volume of the traded security. It is very easy to obtain it – only daily data is necessary. I used 20 days rolling window to get average volume. Here’s an example of IBM security:

Photobucket

The graph shows, how the 20 days average volume evolved since last September – the minimum was 3.5 millions and the maximum was 6 millions traded a day. But the average can be misleading – it takes into account last 20 days and then it derives one number. In most of the case, we are going to deal with short time prediction such as 1 day or 1 hour, so approximation of 20 days does not make a lot sense.
One of possible solutions would be using confidence intervals estimate daily volume. 95% confidence interval is wide used in finance – I will use the same interval answering the following question: based on this interval, what volume will be traded next day.

Photobucket

The density diagram shows distribution of daily volume of IBM security. As you can see most of the days the volume was above 3 millions. However, 5% of the days the volume was below 3 millions. Based on this diagram, we can predict, that tomorrow’s volume will be higher than 3 millions.

Before we finish with the daily volume, we have to test for weekly seasonality. If there is weekday seasonality, then the volume forecast has to be adjusted.

Photobucket
The chart above clearly indicates, that the traded volume on Monday is below (~5%) the day average. There is increase in the volume on Friday, but the significance is under question. Let’s check density diagram to get rid of any doubt about the volatility on Monday and Friday.

Photobucket

The density diagram above shows, that Monday’s peak and the body are shifted to the left. On the other end we can see, that Friday’s body is aligned with others days, only it has a fat tail. Based on this diagram we can eliminate Friday’s volume adjustment and apply Monday’s adjustment only.

Once we finished with daily data we can move to intra-day. Then liquidating (or acquiring) a position in a security, the position has to be sliced into smaller parts to avoid influence of the market and to hide the intentions. For this reason, intra-day volume patterns have to be known. With that in mind, let’s look at how the volume is distributed in relation to the trading hour.

Photobucket

The graph above shows hourly volume pattern, where the volume is grouped by hour. The black dots indicate the median of the hour. Please keep in mind, that the trading starts at 9:30 and the first trading hour has only 30 minutes (if you want align the first hour to the others, then you need to multiply the volume of the first hour by two). As we can see, the first and the last are most traded and the volume drops in the middle of the day.

The next useful thing then liquidating a big position is average trade size. We need to know, how behaves average Joe and what he trades.

Photobucket

The chart above shows all trades grouped by hour – the black dots indicate median of the trades. The following table is supplementary to the chart – here you find median of the hour.

Hour 10 11 12 13 14 15 16
Volume 842.5 565.5 394.0 300.0 297.0 369.5 708.5

Once again, average trade size is much higher in the first and last hours and it drops ~2.5 times during the trading session.

Photobucket

The final chart shows the volatility grouped by hour. There is a lot jittery, then the market opens, it becomes calmer during the lunch time and slightly increases then the market closes. These are the numbers for each hour:

Hour 10 11 12 13 14 15 16
Volatility

(standard deviation)

0.16% 0.09% 0.08% 0.05% 0.05% 0.06% 0.07%

In this article we analyzed a list of potential input parameters for pre-trade analysis and further forecast. The list is not static and can be extended with supplementary parameters, such as bid-ask spread distribution and etc. The next step is to aggregate these parameters and build the models to forecast volatility and volume.

Comments (2)

Politinis savivaldybių žemėlapis

I’m sorry for my English readers – this post is about Lithuania political map and it is written in my native Lithuanian language. By the way, R-Language has been used and the code is shared on git.

Neseniai viesai.lt rašė apie savivaldybių efektyvumą ir 2009 m. išlaidas. Nepaisant to, kad viešos informacijos apie savivaldybes, jų valdymą ir politikus yra labai mažai, papildžiau viesai.lt tyrimą naujais duomenimis ir vizualine medžiaga.

Atlikdamas šį tyrimą, rėmiausi prielaida, kad meras yra atsakingas už visus spendimus savivaldybėje ir turi galutinį žodį sprendžiant visus klausimus ir yra atsakingas už išlaidas. Kadangi visi Lietuvos merai yra partiniai, savivaldybės yra valdomos tų partijų, kurioms priklauso merai. Pradžioje pažiūrėkime, kokio dydžio savivaldybės (pagal gyventojų skaičių) yra valdomos partijų.

Photobucket

Pirmame paveikslėlyje matome kiek savivaldybių priklauso tam tikrai partijai (arba kokiai partijai priklauso meras) ir gyventojų skaičių savivaldybėse. TS turi dvi savivaldybes, kurios yra didžiausios Lietuvoje, LSDP turi didžiausią kiekį jai atstovaujančių merų. Partijos, kurios turi tik po vieną atstovaujantį merą, buvo eliminuotos iš šio grafiko.

Photobucket

Antrame paveikslėlyje atvaizduota suminė informacija apie gyventojus ir partijas. TS partija atstovauja didžiausią gyventojų skaičių savivaldybėse, bet eliminavus dvi didžiausias savivaldybes, skaičius būtų panašus į LSDP. Partijos, kurios turi tik po vieną atstovaujantį merą, buvo eliminuotos iš šio grafiko.

Photobucket

Trečiame paveikslėlyje partijos sugrupuotos pagal jų atstovaujamą politinę orientaciją.

Photobucket

Viršuje esantis paveikslėlis rodo savivaldybių išlaidų tankumą (pgl. viesai.lt išlaidų skaičiavimo metodiką) . Išlaidų vidurkis yra tarp 8 ir 10 proc., todėl viesai.lt pateiktas  blogiausių savivaldybių palyginimas su Klaipėdos m. išlaidomis yra labai optimistinis.

Photobucket

Savivaldybių suskirstymas pagal partijas atskleidžia vieną galimą faktą – mažai savivaldybių turinčios partijos išleidžia jų valdymui daugiau. Paveikslėlyje, kuris yra viršuje, juodais taškais yra pažymėti išlaidų vidurkiai. Kaip matome, LLRA ir TT, kurios turi po dvi savivaldybes, išlaidų vidurkiai yra apie 14.5 proc., kai visų savivaldybių vidurkis yra mažesnis nei 10 proc. Šį faktą reiktų patikrinti įtraukiant kitų metų duomenis, nes dėl mažos imties galimas neteisingas duomenų interpretavimas.

Photobucket

Paskutiniame grafike duomenys atskleidžia seniai žinomą tiesą – mažos savivaldybės yra brangus malonumas. Matome, kaip išlaidų kreivė mažėja gyventojų skaičiui didėjant ir ties 10 (22 000 gyventojų) pasiekia išlaidų vidurkį. Įdomu tai, kad savivaldybėms didėjant, išlaidos nesikeičia.

Tyrimo išeities kodą ir csv failą su duomenimis galima rasti gitHub.

Comments (1)

Correlation network

I came up with an idea to draw correlation network to get a grasp about relationship between a list of stocks. An alternative way to show correlation matrix would be head map, which can have limitations with big matrices (>100).
Unfortunately,  ggplot2 package doesn’t have a easy way to draw the networks, so I was left with igraph or network. I tried both, but somehow chose igraph. If you want to master either package I highly recommend to start from theoretical part – it is very well written and it will save your time trying to understand the package’s conception.

Here is correlation matrix of stocks, which correlation coef. is more than 0.5:
correlation_network

?View Code RSPLUS
require(xts)
require(quantmod)
require(igraph)
cor_mat<- matrix( runif(100), nr=10 )
cor_mat[ lower.tri(cor_mat, diag=TRUE) ]<- 0
cor_mat[ abs(cor_mat) < 0.5]<- 0
 
graph <- graph.adjacency(cor_mat>0.5, weighted=TRUE, mode="upper")
E(graph)$weight<-t(cor_mat)[abs(t(cor_mat))>0.5]
E(graph)[ weight>0.7 ]$color <- "black" E(graph)[ weight>=0.65 &amp; weight<0.7 ]$color <- "red" E(graph)[ weight>=0.6 &amp;weight<0.65 ]$color <- "green" E(graph)[ weight>=0.55 &amp;weight<0.6 ]$color <- "blue"
E(graph)[ weight<0.55  ]$color <- "yellow"
V(graph)$label<- seq(1:10)#V(graph)$name
graph$layout <- layout.fruchterman.reingold
factor<-as.factor(cut(E(graph)$weight*10,c(4,5,6,7,8),labels=c(1,10,20,30))) png('corr_network.png',width=500) plot(decompose.graph(graph)[[which.max(sapply(decompose.graph(graph), vcount))]],edge.width =as.numeric(factor)*1.5,frame=T) legend("bottomleft", title="Colors", cex=0.75, pch=16, col=c("black", "blue","red", "green","pink"), legend=c(">70%", "65-70","60-65","55-60","50-55"), ncol=2)
dev.off()

The code can be found on github.

Comments (10)

Tick data retrieval

I just published Java based code to pull tick data from Interactive Brokers. There are thousands tools to get tick data from IB, but I had one feature in mind.

You can get maximum 50 quotes per second from Interactive Brokers (its IB limitation for TWS API) . Imagine a situation, when there is a delay in swapping incoming information, because I\O process is very slow or a short overload of the system. In such case either some piece of data will be lost or the system will crash. OK, let’s say you have plenty of RAM and speedy hard drive. Does it make sense for real time trading system to write tick data into disk and then pass all information further? Can it be done asynchronous? Yes and Java Message Service was build for that.

So, my idea was to build a tool, which would grab the tick data from provider and pass it to JMS. Retrieval tool doesn’t care what happens next – disk crash, heavy processing of the data or save on the storage. On the other end – JMS can have one or many clients and it will pass all incoming information. If something happens to a client during the transfer of the information – JMS will take care of it – it will wait for fallen clients by preserving incoming information.

If you are looking for how to stick together JMS, ActiveMQ, Spring, Hibernate, JPA and Maven, then this code can help you as well.

Comments (2)

Interesting volatility measurement, part 2

A few weeks ago I have mentioned about an interesting volatility prediction. It is based on two periods of historical volatility (standard deviation). The remaining question was – does it really works? I could not give the answer, because I didn’t have VIX futures data at that time. Later on, I was contacted by Brian G. Peterson, who provided necessary data to finish this test. By the way, I just found, that CBOE shares VIX futures data on its website.

Now I want you to show, what are returns of VIX futures for the next 3 days, then historical volatility ratio of 3 days vs 10 days is less than 0.25:

Photobucket

?View Code RSPLUS
 
Sys.setenv(TZ="GMT")
require('xts')
require('quantmod')
require('blotter')
require('PerformanceAnalytics')
 
tmp<-as.matrix(read.table('tickers/various_day_close/VIXc1.csv',sep=',',header=TRUE))
vix<-as.xts(as.double(tmp[,9]),order.by=as.POSIXct(strptime(tmp[,2],'%d-%b-%Y'),tz='GMT'))
vix<-(vix[!is.na(vix)])
colnames(vix)<-c('Close')
 
 
tmp<-as.matrix(read.table('tickers/various_day_close/ESc1.csv',sep=',',header=TRUE))
es<-as.xts(as.double(tmp[,9]),order.by=as.POSIXct(strptime(tmp[,2],'%d-%b-%Y')))
es<-(es[!is.na(es)])
colnames(es)<-c('Close')
 
#-----------------data end-----------------
 
 
#-----------------signal-------------------
es.delta<-Delt(Cl(es))
delta<-Delt(Cl(vix))#Front contract
 
#Historical volatility during 3 and 10 days
short.vol<-as.xts(rollapply(es.delta,3,sd,align='right'))
long.vol<-as.xts(rollapply(es.delta,10,sd,align='right'))
 
past.vol<-short.vol/long.vol
future.vol<-lag(past.vol,-3)
future.delta<-lag(vix,-3)/vix-1
 
signal<-ifelse(past.vol<0.25,1,0)
 
#here we see, increase in historical volatility
summary(as.double(future.vol[index(signal[signal!=0])]))/summary(as.double(past.vol[index(signal[signal!=0])]))
 
#-----------------signal end-------------------
 
#--------------blotter code------------------
symbols<-c('vix')
 
initDate=time(get(symbols)[1])
initEq=50000
rm(list=ls(envir=.blotter),envir=.blotter)
ltportfolio='volatility'
ltaccount='volatility'
initPortf(ltportfolio,symbols, initDate=initDate)
initAcct(ltaccount,portfolios=c(ltportfolio), initDate=initDate,initEq=initEq)
currency("USD")
stock(symbols[1],currency="USD")
 
signal<-signal[index(vix)]
 
signal[is.na(signal)]<-0
 
counter<-0 #date counter - exit on 3th day
 
for(i in 2:length(signal))
{
	currentDate= time(signal)[i]
	equity = initEq #getEndEq(ltaccount, currentDate)
	position = getPosQty(ltportfolio, Symbol=symbols[1], Date=currentDate)	
	print(position)
	print(currentDate)
	if(position==0 &counter==0)
	{		
		#open a new position if signal is >0
		if(signal[i]>0)
		{
			print('open position')
			closePrice<-as.double(get(symbols[1])[currentDate])
			print(closePrice)
			unitSize = as.numeric(trunc((equity/closePrice)))
			print(unitSize)
			commssions=-unitSize*closePrice*0.0003
			addTxn(ltportfolio, Symbol=symbols[1],  TxnDate=currentDate, TxnPrice=closePrice, TxnQty = unitSize , TxnFees=commssions, verbose=T)
			counter<-1
		}
 
	}
	else
	{
		#position is open. If signal is 0 - close it.
		if(position>0 & as.integer(signal[i])==0 &counter>=3)
		{
			position = getPosQty(ltportfolio, Symbol=symbols[1], Date=currentDate)
			closePrice<-as.double(get(symbols[1])[currentDate])#as.double(get(symbols[1])[i+100])
			commssions=-position*closePrice*0.0003
			addTxn(ltportfolio, Symbol=symbols[1],  TxnDate=currentDate, TxnPrice=closePrice, TxnQty = -position , TxnFees=commssions, verbose=T)
			counter<-0
		}
		else
			counter<-counter+1
 
	}	
	print('>>>>>>>>>>>>')
	updatePortf(ltportfolio, Dates = currentDate)
	updateAcct(ltaccount, Dates = currentDate)
	updateEndEq(ltaccount, Dates = currentDate)
}
rez1<-(getPortfolio(ltaccount))
 
#--------------blotter code end------------------
 
#----------------results------------------------
png('vix_front.png',width=650)
#net profit - commissions, slipage excluded
chart.TimeSeries(cumsum(rez1$symbols$vix$txn[,7]),main='VIX front contract')
dev.off()
#----------------results end------------------------

The graph shows, that this strategy is pure random or just follows VIX index. Now let’s see, what are returns of this strategy, if S&P500 futures are used instead of VIX.

Photobucket

?View Code RSPLUS
 
signal<-ifelse(past.vol<0.25,1,0)
#signal<-signal[index(es)]
 
 
 
#------------------------blotter code-----------------------
symbols<-c('es')
 
initDate=time(get(symbols)[1])
initEq=15000
rm(list=ls(envir=.blotter),envir=.blotter)
ltportfolio='volatility'
ltaccount='volatility'
initPortf(ltportfolio,symbols, initDate=initDate)
initAcct(ltaccount,portfolios=c(ltportfolio), initDate=initDate,initEq=initEq)
currency("USD")
future(symbols[1],currency="USD",multiplier=50,1/4)
 
signal[is.na(signal)]<-0
 
counter<-0
 
for(i in 2:length(signal))
{
	currentDate= time(signal)[i]
	equity = initEq #getEndEq(ltaccount, currentDate)
	position = getPosQty(ltportfolio, Symbol=symbols[1], Date=currentDate)	
	print(position)
	print(currentDate)
	if(position==0 &counter==0)
	{		
		#open a new position if signal is >0
		if(signal[i]>0)
		{
			print('open position')
			closePrice<-as.double(get(symbols[1])[currentDate])
			print(closePrice)
			unitSize = 1#as.numeric(trunc((equity/closePrice)))
			print(unitSize)
			commssions=-2
			addTxn(ltportfolio, Symbol=symbols[1],  TxnDate=currentDate, TxnPrice=closePrice, TxnQty = unitSize , TxnFees=commssions, verbose=T)
			counter<-1
		}
 
	}
	else
	{
		#position is open. If signal is 0 - close it.
		if(position>0 & as.integer(signal[i])==0 &counter>=3)
		{
			position = getPosQty(ltportfolio, Symbol=symbols[1], Date=currentDate)
			closePrice<-as.double(get(symbols[1])[currentDate])#as.double(get(symbols[1])[i+100])
			commssions=-2
			addTxn(ltportfolio, Symbol=symbols[1],  TxnDate=currentDate, TxnPrice=closePrice, TxnQty = -position , TxnFees=commssions, verbose=T)
			counter<-0
		}
		else
			counter<-counter+1
 
	}	
 
	updatePortf(ltportfolio, Dates = currentDate)
	updateAcct(ltaccount, Dates = currentDate)
	updateEndEq(ltaccount, Dates = currentDate)
}
rez1<-(getPortfolio(ltaccount))
#-------------------------results---------------------
#net profit
png('vix.png',width=650)
chart.TimeSeries(cumsum(rez1$symbols$es$txn[,9]),main='ES future contract')
dev.off()

Well, that is exact opposite of expectations – if we expect volatility increase, as it was described in the first post, then the returns of S&P index have to be negative in long run.

From the beginning I suspected, that it has more to do with standard deviation formula and less with forecast.
Now funny part – I generated 2500 random returns and got median 0.9930  and mean 1.6360 for all days. Then I took all days, when buy signal suppose to be generated and guess what mean did I get? Median was 4.3170  and mean 6.3450. Once again, significant difference but on random data.

Source code on github

Comments (5)

Seasonal pair trading

quanttrader.info is a good quantitative repository, where I found an idea about seasonal spreads play.

The idea of seasonal pair trading differs from pairs trading in a way, that it doesn’t try to find deviation from the spread’s mean, but it looks at seasonal spread patterns. In some cases it is easier to find an explanation, why seasonal spread works at all. For example, during the winter time the consumption of heating oil goes up, but it is opposite for gasoline. During the summer is just opposite – because of holidays the demand for gasoline shuts up.

The data

Be aware, that you can obtain different results, because a lot of depends on the data quality and understanding of the data. In real world continuous contract doesn’t exist. Yes, there is some shops/brokers which provide such contract, but NYMEX exchange has future contracts with fix duration. In later case, you have to derive your own continuous contract and revolve it each month in case it is front month contract.

To run this test, I took the data from here. I had to relay on freely available data in this case, because I don’t have access to commercial data for such long period. Let me know, if have substantial differences in the result of this test with others data providers. Here is some differences between my results and the results share by quanttrader.info.

The test

First of all, let’s plot cumulative returns of the oil (CL) and the gasoline (RB) front month contracts:

Photobucket

The next graph shows cumulative spread between CL and RB in percentage terms. It is difficult to spot any seasonal pattern just by looking at it, except that during some years it was trending down. This can be a problem for long term investment (let say more that 3 months – it is just an educated guess).

Photobucket

Let’s look what are daily averages aggregated by month in percentage terms:

01 -0.12%
02 0.01%
03 -0.44%
04 -0.08%
05 0.02%
06 0.28%
07 -0.04%
08 -0.03%
09 0.348%
10 0.12%
11 -0.009%
12 -0.07%
As we can see, here is 3 months (in bold), which have average returns deviated from its daily mean -0.0029%. Because averages can be misleading, it is worth to check intervals of these averages. But this time, instead of daily means I used monthly returns to generate following graph:

Photobucket

The graph above shows, that some months had the returns around zero or the returns were distributed very wildly, for example like August. However, during March, June and September the returns were very consistent. Let’s take a look on March’s cumulative return:

Photobucket

Here is the problem – during the last years the curve flattened and March’s returns are close to zero.Well, it basically means, that you have to avoid investing in spread during this month.

Now, let’s check what were the cumulative returns of June (black) and September (red)?

Photobucket

This time the returns are much more consistent and can be used for further development.

The final word

The results of my study do not support the result obtained by Paul Teetor. Most likely the differences come from the data. I used free data and I can’t be sure, that this data repository can be trusted. In this study I used front month contracts, which are expiring in the same month. If you try the same study with the following month, then results will be different as well.

Paul Teetor mentioned in his study, that he prefers to deal with dolor returns, however my study is based on price returns. I tried to obtain the hedge value 1.13 disclosed in his study, but I got it my way as presented below. The hedge value is important, because you have to know how much invest in each asset. The reason for that is, that each asset can have different volatility and you need different amount of money for short leg and another amount for long leg. Below is the graph where you can see yearly difference between volatilities of Cl and RB:

Photobucket

When the value is above zero, then you have underweight oil and overweight gasoline, because the latter is less volatile. By the way, this graph doesn’t provide the hedge ratio – it is just proof of concept.

The source file the can be find on github or by clicking on View Code below.

?View Code RSPLUS
require('xts')
require('quantmod')
Sys.setenv(TZ="GMT")
 
require('PerformanceAnalytics')
 
tmp<-as.matrix(read.table('tickers/various_day_close/rb_contract1.csv',sep=',',header=FALSE))
rb<-as.xts(as.double(tmp[,2]),order.by=as.POSIXct(strptime(tmp[,1],'%Y-%m-%d')))
 
rb<-tail(rb,-3)['::2010-11']
 
 
tmp<-as.matrix(read.table('tickers/various_day_close/cl_contract1.csv',sep=',',header=FALSE))
cl<-as.xts(as.double(tmp[,2]),order.by=as.POSIXct(strptime(tmp[,1],'%Y-%m-%d')))
cl<-tail(cl,-3)['::2010-11']
 
 
rb.delta<-Delt(((rb)))['1997-01::']
cl.delta<-Delt(((cl)))['1997-01::']
 
rb.delta[is.na(rb.delta)]<-0
cl.delta[is.na(cl.delta)]<-0
 
spread<-cl.delta*50000-rb.delta*50000
 
png('spread_cl_prices.png',width=650)
chart.CumReturns(cbind(cl.delta,rb.delta),col=c(2,3),main='Oil & Gasoline prices')
dev.off()
 
png('spread_cl_rb.png',width=650)
chart.TimeSeries(cumsum(spread),main='Seasonal spread: CL vs RB')
dev.off()
 
spread<-cl.delta-rb.delta
 
png('spread_cl_rb_prc.png',width=650)
chart.CumReturns((spread),main='Seasonal spread %: CL vs RB')
dev.off()
 
spread.factor<-as.factor(format(index(spread),'%m'))
aggregate(spread, spread.factor,mean)
summary(lm(as.double(spread)~(spread.factor)))
 
 
 
rb.delta.monthly<-Delt(Cl(to.monthly(rb)))['1997-01::']
cl.delta.monthly<-Delt(Cl(to.monthly(cl)))['1997-01::']
 
rb.delta.monthly[is.na(rb.delta.monthly)]<-0
cl.delta.monthly[is.na(cl.delta.monthly)]<-0
 
factor<-as.factor(format(index(cl.delta.monthly-rb.delta.monthly),'%m'))
tmp<-data.frame(as.double(cl.delta.monthly-rb.delta.monthly),as.numeric(factor))
 
require('ggplot2')
png('monthly_averages.png',width=650)
qplot(factor(as.numeric(factor)),as.double(cl.delta.monthly-rb.delta.monthly),data=tmp,geom = "boxplot",ylab='Monthly average returns',xlab='Months')
dev.off()
 
png('march_cumulative.png',width=650)
chart.CumReturns(spread[spread.factor=='03'],main='March cumulative return')
dev.off()
 
png('yearly_diff.png',width=650)
chart.TimeSeries(cbind(as.xts(rollapply(rb.delta,250,sd,align='right'))-as.xts(rollapply(cl.delta,250,sd,align='right'))),main='Yearly difference of vol. between CL & RB')
dev.off()

Comments (13)

High readings of VIX index during 2 days

During last two sessions (December 23th and 27th), VIX index posted returns (close to close) above 6 %. My question is – what return from S&P500 index can we expect next day after such event?

As you can see from the graph below, expected return from S&P500 index is positive. During 1995-2010 were 53 such events and mean return was 1.02 % and median 0.6% and win rate 65%.
What can be the explanation for such consistency in returns? It is known, that volatility is mean reverting process and the value of VIX index tends to return to its mean. So, on third day volatility dries up and S&P500 index on average shows positive return that day.

Photobucket

Worth to note, that despite VIX index spike, S&P 500 index was very very still during the last days.

?View Code RSPLUS
 
require('xts')
require('quantmod')
require('blotter')
require('PerformanceAnalytics')
require('FinancialInstrument')
Sys.setenv(TZ="GMT")
 
#data part
getSymbols(c('SPY','^VIX'),from='1995-01-01',index.class=c("POSIXt","POSIXct"))
dividends<-getDividends('SPY',from='1995-01-01',index.class=c("POSIXt","POSIXct"))
 
temp<-cbind(dividends,SPY)
temp[,1][is.na(temp[,1])]<-0
 
SPY<-cbind(temp[,2],temp[,3],temp[,4],temp[,1]+temp[,5])
colnames(SPY)<-c('Open','High','Low','Close')
spy.delta<-Delt(Cl(SPY))
 
vix.delta<-Delt(Cl(VIX))
 
signal<-ifelse(vix.delta>0.06& lag(vix.delta>0.06,1),1,0)
png('2highdays.png',width=650)
chart.CumReturns(lag(signal,1)*spy.delta,main='when VIX >6% during two days')
dev.off()
 
#blotter code
symbols<-c('SPY')
SPY<-Cl(SPY)
initDate=time(get(symbols)[1])
initEq=50000
rm(list=ls(envir=.blotter),envir=.blotter)
ltportfolio='2high'
ltaccount='2high'
initPortf(ltportfolio,symbols, initDate=initDate)
initAcct(ltaccount,portfolios=c(ltportfolio), initDate=initDate,initEq=initEq)
currency("USD")
stock(symbols[1],currency="USD",multiplier=1)
 
signal[is.na(signal)]<-0
 
 
for(i in 2:length(signal))
{
	currentDate= time(signal)[i]
	equity = initEq #getEndEq(ltaccount, currentDate)
	position = getPosQty(ltportfolio, Symbol=symbols[1], Date=currentDate)	
	print(position)
	print(currentDate)
	if(position==0)
	{		
		#open a new position if signal is >0
		if(signal[i]>0 )
		{
			print('open position')
			closePrice<-as.double(Cl(SPY[currentDate]))
			print(closePrice)
			unitSize = as.numeric(trunc((equity/closePrice)))
			print(unitSize)
			commssions=-unitSize*closePrice*0.0003
			addTxn(ltportfolio, Symbol=symbols[1],  TxnDate=currentDate, TxnPrice=closePrice, TxnQty = unitSize , TxnFees=commssions, verbose=T)
 
		}
 
	}
	else
	{
		#position is open. If signal is 0 - close it.
		if(position>0 &signal[i] ==0)
		{
			position = getPosQty(ltportfolio, Symbol=symbols[1], Date=currentDate)
			closePrice<-as.double((Cl(SPY[currentDate])))#as.double(get(symbols[1])[i+100])
			commssions=-position*closePrice*0.0003
			addTxn(ltportfolio, Symbol=symbols[1],  TxnDate=currentDate, TxnPrice=closePrice, TxnQty = -position , TxnFees=commssions, verbose=T)
 
		}
 
	}	
	updatePortf(ltportfolio, Dates = currentDate)
	updateAcct(ltaccount, Dates = currentDate)
	updateEndEq(ltaccount, Dates = currentDate)
}
rez1<-(getPortfolio(ltaccount))
png('2highdays_2.png',width=650)
chart.CumReturns(rez1$symbols$SPY$txn[,7]/initEq)
dev.off()

Comments (6)

Interesting volatility measurement

Long time ago I stumbled across interesting volatility measurement at quantifiableedges.blogspot.com.

The idea is following: take 3-day historical volatility of S&P 500 index and divide that by 10-day historical volatility. Then mark all points which are less that 0.25 and measure the volatility of 3 following days. On average, the volatility of following 3 days will be 5 times higher.

?View Code RSPLUS
require('xts')
require('quantmod')
Sys.setenv(TZ="GMT")
 
getSymbols(c('SPY','^VIX'),from='1995-01-01')
 
spy.delta<-Delt(Cl(SPY))
 
short.vol<-as.xts(rollapply(spy.delta,3,sd,align='right'))
long.vol<-as.xts(rollapply(spy.delta,10,sd,align='right'))
 
future.vol<-(head(lag(short.vol,-3),-3))
 
past.vol<-short.vol/long.vol
 
signal<-index(past.vol[past.vol<0.25])#ifelse(past.vol<0.3,1,0)
 
 
 
temp<-cbind(future.vol,short.vol)
temp<-(tail(temp,-1))
temp<-(head(temp,-3))
 
print('all days:')
summary(as.double(temp[,1])/as.double(temp[,2]))
 
print('days, then past volatility < 0.25:')
summary(as.double(temp[,1][signal])/as.double(temp[,2][signal]))

I was tweaking the result to squeeze some profit, but not so much luck. Basically, you need to trade either VIX index derivatives or S&P 500 index options to get direct impact. Before doing that, you need to test historical performance. Unfortunately, I don’t have data for these instruments. What about ETF, like VXX? Nope, because only few data points in the testing sample.

Later, I will try to incorporate GARCH model to see if this going to help. Any fresh ideas on this?

Comments (3)

3 weak days in a row

Recently, Trading the odds posted one of many flavors of mean reverting strategies and I decided to get my hands dirty by writing R code and testing it.

You can find full description of the strategy by following latter link above. Long story short – if SPY shows lower open, high and close 3 days in a row, then buy on the close of third day and sell it 1 days later.
Let’s do simple test:

?View Code RSPLUS
require('xts')
require('quantmod')
getSymbols('SPY',from='1995-01-01',index.class=c("POSIXt","POSIXct"))
dividends=getDividends('SPY',from='1995-01-01',index.class=c("POSIXt","POSIXct"))
 
temp=cbind(dividends,SPY)
temp[,1][is.na(temp[,1])]=0
 
SPY=cbind(temp[,2],temp[,3],temp[,4],temp[,1]+temp[,5])
colnames(SPY)=c("Open","High","Low","Close")
 
#one day before
lag1=lag((SPY),1)
 
#two days defore
lag2=lag((SPY),2)
 
signal=ifelse( (Cl(lag2)>Cl(lag1) & Cl(lag1)>Cl(SPY))&
			(Hi(lag2)>Hi(lag1) & Hi(lag1)>Hi(SPY)) &
			(Op(lag2)>Op(lag1) & Op(lag1)>Op(SPY)),
			1,0
)
#one day later
lag3=lag(Cl(SPY),-1)
 
profit=(lag3/Cl(SPY)-1)*signal
profit[is.na(profit)]=0
png(file='first.png',width=500)
plot(cumprod(profit+1),main='Profit 1995-2010')
dev.off()

The code above supposed to produce something similar:

Photobucket

Nice curve, isn’t it? But neither commissions nor slippage were taken into account. So, let’s run more complicated test. For that purpose I utilized blotter package. Here’s the code:

?View Code RSPLUS
require('xts')
require('quantmod')
require('blotter')
require('PerformanceAnalytics')
require('FinancialInstrument')
getSymbols('SPY',from='1995-01-01',index.class=c("POSIXt","POSIXct"))
dividends=getDividends('SPY',from='1995-01-01',index.class=c("POSIXt","POSIXct"))
 
temp=cbind(dividends,SPY)
temp[,1][is.na(temp[,1])]=0
 
SPY=cbind(temp[,2],temp[,3],temp[,4],temp[,1]+temp[,5])
colnames(SPY)=c('Open','High','Low','Close')
 
#one day before
lag1=lag((SPY),1)
 
#two days defore
lag2=lag((SPY),2)
 
signal=ifelse( (Cl(lag2)>Cl(lag1) & Cl(lag1)>Cl(SPY))&
			(Hi(lag2)>Hi(lag1) & Hi(lag1)>Hi(SPY)) &
			(Op(lag2)>Op(lag1) & Op(lag1)>Op(SPY)),
			1,0
)
#one day later
lag3=lag(Cl(SPY),-1)
 
symbols=c('SPY')
 
initDate=index(get(symbols)[1])
initEq=10000
rm(list=ls(envir=.blotter),envir=.blotter)
ltportfolio='3days'
ltaccount='3days'
initPortf(ltportfolio,symbols, initDate=initDate)
initAcct(ltaccount,portfolios=c(ltportfolio), initDate=initDate,initEq=initEq)
currency("USD")
stock("SPY",currency="USD",multiplier=1)
 
signal[is.na(signal)]=0
 
counter<-0
 
for(i in 2:length(signal))
{
	currentDate= time(signal)[i]
	equity = 10000 #getEndEq(ltaccount, currentDate)
	#print(paste("equity ",equity))
	position = getPosQty(ltportfolio, Symbol=symbols[1], Date=currentDate)	
	print(currentDate)
	if(position==0)
	{		
		#open a new position if signal is >0
		if(signal[i]>0 &counter ==0)
		{
			print('open position')
			closePrice<-as.double(Cl(SPY[currentDate]))
			unitSize = as.numeric(trunc((equity/closePrice)))
			commssions=-unitSize*closePrice*0.0003
			addTxn(ltportfolio, Symbol=symbols[1],  TxnDate=currentDate, TxnPrice=closePrice, TxnQty = unitSize , TxnFees=commssions, verbose=T)
			counter<-1
		}
 
	}
	else
	{
		#position is open. If signal is 0 - close it.
		if(position>0 &counter>=1)
		{
			print('close position>>>>')
			position = getPosQty(ltportfolio, Symbol=symbols[1], Date=currentDate)
			closePrice<-as.double((Cl(SPY[currentDate])))#as.double(get(symbols[1])[i+100])
			commssions=-position*closePrice*0.0003
			addTxn(ltportfolio, Symbol=symbols[1],  TxnDate=currentDate, TxnPrice=closePrice, TxnQty = -position , TxnFees=commssions, verbose=T)
			counter<-0
		}
		else
			counter<-counter+1
	}	
	#print(equity)
	updatePortf(ltportfolio, Dates = currentDate)
	updateAcct(ltaccount, Dates = currentDate)
	updateEndEq(ltaccount, Dates = currentDate)
 
	#equity = getEndEq(ltaccount, currentDate)
	#print(paste("equity ",equity))
}
result=rez1$symbols$SPY$txn[,7]
result=result[result!=0]
 
png(file='second.png',width=500)
#fix commission rate 2*3
plot(cumsum(result-6))
#next line will allow you to compare the performace with and without commissions
#chart.CumReturns(cbind((result)/10000,(result-6)/10000))
dev.off()

Photobucket

Nice curve, but let’s look beyond that. First of all, here’s nice function in PerformanceAnalytics package, AnnulizedReturns:
table.AnnualizedReturns((result-6)/10000)
Gross.Txn.Realized.PL
Annualized Return 0.0265
Annualized Std Dev 0.0494
Annualized Sharpe (Rf=0%) 0.5366

Well, Sharpe ratio is not impressive. The profit percentage of this strategy is 57% and mean of profitable return is 111$ against 98$ loss. Profit factor is ~1.55.

I think, this strategy can be as one of the parameter or vote in another system, but alone it is weak.

Comments (5)

« Previous Page · Next Page »