Novel Product Database Using Vectors

The conventional approach to information systems is ‘three-tiered’: consisting of
a database, and analysis system and a presentation system. For example,
a low-cost system for predictive business analytics might use a MySQL database,
the statistics languge R for analytics, and a browser for presentation of results.

While there are advantages in
partitioning clearly delineating levels, inefficiencies and errors introduced when transferring
data between levels, particularly in the reading and writing of data to databases.
A novel approach producing very efficient systems for financal analysis is to
integrate database and analysis functions into the one, memory resident application.
Using this approach the vector-based K language has produced
the fastest industry trading applications in existence — over 100 times
faster than equivalent three-tiered applications.

In this section we show how to use R as a database by replicating
relational database operations including select and join.
As well as simplicity and efficiency for smaller systems,
the simple customer example helps to build knowledge of the
R’s powerful indexing operations.

Customer example

First we create a simple customer database consisting of three tables: customer, product and order.

>customerproductorder< -data.frame(order_no=c(12,13,14,15,16,17),
customer_id=c(1,1,2,2,2,3),
date=c(20060711,20050823,20060101,20060112,20060202,20060401))

Adding a new order for a d40 camera is a simple operation using rbind.
Adding a new column to
a table similar and much simpler than modifying a table in a database, using the operation cbind.

>order< -rbind(order,data.frame(order_no=18,customer_id=4,product_id="d40",date=20060606))

Now lets use a basic database operation, SELECT. To select all orders for the d50 cameras type the following. The $ is used to specify a particular column in the order table. The comma is needed
because the table is two dimensional. It says to entire columns with a particular row value.

>order[order$product_id=="d40",]

order_no customer_id product_id     date
1        12           1        d40 20060711
3        14           2        d40 20060101
6        17           3        d40 20060401
11       18           4        d40 20060606

In order to count the number of sales of d40 cameras, the rows containing that product are slected
and the dimension of the result is used to return the number of rows.

>dim(order[order$product_id=="d40",])[1]

4

Now, say we wanted to determine the total sales of each product line.
In a conventional database we would first join the two tables product and orders
with an SQL operation something like:

SELECT * FROM product, order WHERE product.product_id = order=order.product_id.

In R the equivalent command would use the merge command (and save as a temporary variable m):

>m< -merge(product,order,by.product="product_id",by.order="product_id")

product_id description  price order_no customer_id     date
1        d40      camera 599.00       12           1 20060711
2        d40      camera 599.00       14           2 20060101
3        d40      camera 599.00       17           3 20060401
4        d40      camera 599.00       18           4 20060606
5        d41      camera 233.33       16           2 20060202
6       mx90     printer 123.00       13           1 20050823
7       mx90     printer 123.00       15           2 20060112

In R we would then use the function aggregate to sum the prices for each product.

> aggregate(m$price,by=list(m$product_id),sum)
  Group.1       x
1     d40 2396.00
2     d41  233.33
3    mx90  246.00

Finally to save the database, including all the tables and temporary variables,
execute save.image(file=”filename”). The data is recovered with
load(“filename”) or R can be started in that directory.

These examples show that all of the major functions of a database that might be needed
for accessing and preparing data for predictive analytics can be performed with
simple statements in the R language. For more examples of using R as a database
with examples from niche modeling see the post here.

Advertisements

0 thoughts on “Novel Product Database Using Vectors

  1. Pingback: wynajem samochodów

  2. Pingback: wypozyczalnia lawet

  3. Pingback: kliknij tutaj

  4. Pingback: polecam

  5. Pingback: zobacz oferte

  6. Pingback: najcudowniejsze pozywienie w Krakowie

  7. Pingback: darmowe filmiki erotyczne

  8. Pingback: tworzenie sklepów internetowych

  9. Pingback: witryna firmowa

  10. Pingback: program pity 2014 - pity2015program.pl

  11. Pingback: link do strony

  12. Pingback: kliknij

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s