Saturday, August 18, 2012

An example of OOP in GNU R using S4 Classes

Recently I have discussed with my friend from WLOG Solutions an implementation of banking cash management engine in GNU R. The code made a nice use of S4 classes so I thought it would be worth showing as an example.

The problem

Every commercial bank needs to provide its customers with access to cash via hundreds of cash access points like ATMs or branches. Bank managers facing this problem have to handle three conflicting objectives: (1) they have to ensure that there is enough cash in cash access points to maintain liquidity, (2) they want to minimize the amount of cash frozen because it is not working for bank and (3) they want to minimize transportation costs from central vault to access points. This is a complex optimization problem which in particular involves the need to predict cash balance in access point every day using historic data.
When designing a solution providing forecasts of cash balances in access points a case for typical application of object oriented approach arises. For each cash point we supply historical data having the same structure and want to obtain a balance prediction. However different access points have different customer usage characteristics and will require different statistical forecasting models. For example in ATM one can only withdraw the money but in branch you can as well make a deposit. Therefore we model the bank using S4 classes.

The implementation

There is a Bank class that can have many CashPoints associated with it. CashPoint is a virtual class that has two implementing classes ATM and Branch. This structure is shown on figure below. Each CashPoint holds its historical balances and has a givePrediction() method that provides a forecast. This method will be implemented differently in ATM and Branch classes.
The example code implementing this structure is given on listing below. First using setGeneric() function we create a generic function givePrediction() that will dispatch appropriate methods following the class of its arguments. Next we create definitions of Bank, ATM and Branch classes of S4 type using setClass() function and create formal method for givePrediction() function for those classes. In our example for ATMs we use linear regression and for Branches simple mean as balance predictors. Notice this method defined for CashPoint class will be invoked if it will not be overridden by appropriate methods in subclasses (it is not possible to create an object of class CashPoint as it is defined virtual).
The code is run by invoking givePrediction() function on a new Bank class instance. The constructor of Bank class reads bank structure data from bank_model.csv file that contains the list of cash points with their ids and types (ATM or Branch). Next it invokes creation of CashPoint-s. Each cash point is initialized with data from branch_balances_data.csv which contains three columns: BranchId, Date, Balance. An appropriate subset of data is first selected using BranchId column. Date and Balance are retained in balances field and contain historical data for this cash point. After creation of an object of type Bank its givePrediction() method is invoked which calls automatically via S4 class system either ATM or Branch givePrediction() method according to the run-time type of cash point.
And here is the code. I hope you will find this simple example of S4 classes application useful.

setGeneric("givePrediction", function(object) {
    standardGeneric("givePrediction")
})

setClass("Bank", representation(cashPoints = "list"))
setMethod("initialize", "Bank", function(.Object){
    BankModel <- read.table(file = "bank_model.csv",
        sep = ";", header = TRUE, stringsAsFactors = FALSE)
    .Object@cashPoints <- apply(BankModel, 1, function(cp) {
        new(cp[2], cp[1])
    })
    names(.Object@cashPoints) <- apply(BankModel, 1,
        paste, collapse = "_")
    return(.Object)
})
setMethod("givePrediction", "Bank", function(object){
    return(sapply(object@cashPoints, "givePrediction"))
})

setClass("CashPoint", representation(id = "character",
    balances = "data.frame", "VIRTUAL"))
setMethod("initialize", "CashPoint", function(.Object, cashPointId){
    .Object@id <- cashPointId
    balances <- read.table(file = "branch_balances_data.csv",
        sep = ";", header = TRUE)
    .Object@balances <- subset(balances,
        balances$BranchId == .Object@id, -BranchId)
    .Object@balances$Date <- as.Date(.Object@balances$Date)
    return(.Object)
})
setMethod("givePrediction", "CashPoint", function(object){
    stop("no givePrediction method for this class")
})

setClass("Branch", contains = "CashPoint")
setMethod("givePrediction", "Branch", function(object){
    return(mean(object@balances$Balance))
})

setClass("ATM", contains = "CashPoint")
setMethod("givePrediction", "ATM", function(object) {
    LM <- lm(Balance ~ as.numeric(Date), data = object@balances)
    prediction <- predict(LM,
        data.frame(Date = 1 + max(object@balances$Date)))
    return(unname(prediction))
})

print(givePrediction(new("Bank")))

To run the code you need to create bank_model.csv and branch_balances_data.csv files. Here you have sample truncated contents (you need to copy-paste the text given below and save in GNU R working directory under appropriate names).

bank_model.csv
CashPointId;CashPointType
CashPoint_1;ATM
CashPoint_2;Branch
CashPoint_3;ATM
CashPoint_4;Branch
CashPoint_5;ATM

branch_balances_data.csv
BranchId;Date;Balance
CashPoint_1;2012-12-01;423000
CashPoint_1;2012-12-02;312000
CashPoint_1;2012-12-03;220000
CashPoint_1;2012-12-04;123000
CashPoint_2;2012-12-01;223000
CashPoint_2;2012-12-02;212000
CashPoint_2;2012-12-03;320000
CashPoint_2;2012-12-04;223000
CashPoint_3;2012-12-01;323000
CashPoint_3;2012-12-02;312000
CashPoint_3;2012-12-03;270000
CashPoint_3;2012-12-04;223000
CashPoint_4;2012-12-01;323000
CashPoint_4;2012-12-02;412000
CashPoint_4;2012-12-03;320000
CashPoint_4;2012-12-04;373000
CashPoint_5;2012-12-01;223000
CashPoint_5;2012-12-02;192000
CashPoint_5;2012-12-03;150000
CashPoint_5;2012-12-04;133000

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.