Thursday, December 29, 2005

Wi-Fi Standards

Information Week has a wonderful article on Wi-Fi standards.

Monday, December 26, 2005

Assignment Operator '+='

I had a bug in my code today. After hours of debugging, I found the silly mistake I had made. The compiler did not flag it because the statement was syntactically valid: instead of '+=' I had typed '=+', which assigns the value instead of adding it. This silly typo gave me weird results and cost me a lot of debugging time.
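Here is a minimal Java sketch of why the typo compiles (the same parsing applies in C# and C). The class and method names are made up for illustration; the point is only that `total =+ v` is read as `total = (+v)`.

```java
// Illustrative sketch: class and method names are hypothetical.
class PlusEqualsTypo {
    // Intended behavior: accumulate every value into the running total.
    static int sumWithCompoundAssignment(int[] values) {
        int total = 0;
        for (int v : values) {
            total += v;   // total = total + v
        }
        return total;
    }

    // The typo: "=+" is a plain assignment of a unary-plus expression.
    static int sumWithTypo(int[] values) {
        int total = 0;
        for (int v : values) {
            total =+ v;   // parsed as total = (+v); overwrites total each time
        }
        return total;
    }

    public static void main(String[] args) {
        int[] values = {1, 2, 3};
        System.out.println(sumWithCompoundAssignment(values)); // prints 6
        System.out.println(sumWithTypo(values));               // prints 3
    }
}
```

The typo version silently keeps only the last value, which is why the results look plausible but wrong.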

Matlab code to C

1. If your program uses a GUI, check that you have installed the
Matlab C/C++ Library and the Matlab Graphics Library.
2. Suppose your program is called "experiment"; then the first line
of your m-file must be
%#function experiment (<-exactly like this)
3. Run mbuild -setup
You'll be asked to choose your C compiler (if you don't have one,
select the one that Matlab provides).
4. mcc - B experiment
This creates your executable file and all the libraries needed to
move it to another PC.
5. Read the Matlab C/C++ Graphics Library help for more
information.

KnownColor

KnownColor is a very useful .NET enumeration that lists all the system-defined colors. C# Color Table provides a visual sheet for choosing colors while writing the program.

Friday, December 23, 2005

Nokia 6610 and IRDA - Tweak

Combined with the Nokia PC Suite software, this is the best cellular phone that I have had. One can add polyphonic ring tones, edit them, copy the phonebook, and do almost anything through the infrared connection. The purpose of this article is to provide tips on connecting to the device using the IR port. I used a BAFO USB to IrDA Adapter on Windows XP Professional. Please refer to Sigmatel for more technical information and troubleshooting tips on infrared wireless.

I own a Nokia 6610 and always wanted to sync or connect it to my computer. Since the Nokia 6610 lacks a Bluetooth connection, I had the option of using either a data cable or the IR (infrared) connection. Even though it was easy to connect to the Nokia's IR port, a little tweaking was necessary for stability. Otherwise, the connection kept breaking up intermittently.

The following steps might help you attain a stable connection.

  1. Download Nokia PC Suite software. It is freely available on Nokia’s website. At the time of writing this article, I used Nokia PC Suite 5.1 for interfacing with Nokia 6610.
  2. Connect the USB dongle. Win XP should recognize the drivers. I observed that the infrared connection was not stable.
  3. Go to Control Panel -> System -> Hardware -> Device Manager -> IR device.
  4. Change the settings for USB dongle as specified below:

Change IRDA Transceiver Type to Sigmatel 4000, Min. Turn-Around Time to 1.0mS (DEFAULT), and Speed Enable to 115200.

This change of settings gave me a stable Infrared connection to the Nokia 6610 and I was able to upload/download images, pictures, phonebook entries etc.

Please leave a comment if you have any other suggestions or if this tip worked for you.

DateIteration

I created DateIteration, a C# class that builds on the DateTime structure to support time series. It converts an iteration index (time step) to a date and vice versa.
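The idea behind DateIteration can be sketched in Java as follows. This is not the original C# class; it is a hypothetical illustration that assumes a fixed start timestamp and a fixed step between iterations (hourly in the usage below).

```java
import java.time.Duration;
import java.time.LocalDateTime;

// Hypothetical sketch, not the original C# DateIteration class.
class DateIterationSketch {
    private final LocalDateTime start; // timestamp of iteration 0 (assumed)
    private final Duration step;       // spacing between iterations (assumed fixed)

    DateIterationSketch(LocalDateTime start, Duration step) {
        this.start = start;
        this.step = step;
    }

    // Iteration index -> timestamp.
    LocalDateTime toDate(long iteration) {
        return start.plus(step.multipliedBy(iteration));
    }

    // Timestamp -> iteration index, rounded down to a whole step.
    long toIteration(LocalDateTime date) {
        long elapsedMillis = Duration.between(start, date).toMillis();
        return Math.floorDiv(elapsedMillis, step.toMillis());
    }
}
```

With a start of 1 June 2000 00:00 and a one-hour step, `toDate(25)` gives 2 June 2000 01:00, and `toIteration` maps that timestamp back to 25.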












Saturday, December 17, 2005

Flowcharting - Visio vs. PowerPoint

PowerPoint has all the capabilities to create a good flow chart. But, Visio is more elegant.

Friday, December 16, 2005

Problem description

I am still stuck with the problem description and algorithm. I have to finalize this part before I start coding.

Thursday, December 15, 2005

Writing the idea

I finished the first draft of my idea. While I was writing, the original thought refined itself. My algorithm seems to be a feasible solution. I will have to implement it and see. I hope to finish it by tomorrow.

My advisor told me that I did not write it clearly. He did not understand. Having a second look at my draft, I think he is right.

Have to write it again.

Wednesday, December 14, 2005

Decent Progress

I thought of a new idea for my current problem. Two things are left to be done: writing up and implementing the idea. I finally seem to have overcome my lethargic phase by decomposing the problem at hand.

Saturday, December 03, 2005

Hacking attacks

SANS released the top 20 most-critical Internet security vulnerabilities for 2005.

It shows a troubling change in the pattern of hacking attacks. The attacks have shifted from targeting operating systems and Internet services on Web servers and e-mail servers during 1999-2004 to targeting applications and network devices’ operating systems.

Sunday, November 27, 2005

Cumulative Normal Distribution Function

The cumulative normal distribution function is defined as an integral that has no closed-form solution. A good approximation can be found in Milton Abramowitz and Irene A. Stegun, Handbook of Mathematical Functions, National Bureau of Standards, 1964.

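The following code is a Java function for calculating the cumulative normal distribution.

// Cumulative normal distribution via the Abramowitz-Stegun polynomial approximation.
// calculateMean() and calculateSD() are defined elsewhere in the class.
public double CNDF(double z)
{
    double mean = calculateMean();
    double sd = calculateSD();
    z = (z - mean) / sd; // standardize; the approximation below is for N(0,1)
    if (z > 6.0) return 1.0;
    if (z < -6.0) return 0.0;
    double b1 = 0.31938153;
    double b2 = -0.356563782;
    double b3 = 1.781477937;
    double b4 = -1.821255978;
    double b5 = 1.330274429;
    double p = 0.2316419;
    double c2 = 0.3989423; // 1/sqrt(2*pi)
    double a = Math.abs(z);
    double t = 1.0 / (1.0 + a * p);
    double b = c2 * Math.exp((-z) * (z / 2.0)); // standard normal density at z
    double n = ((((b5 * t + b4) * t + b3) * t + b2) * t + b1) * t;
    n = 1.0 - b * n;            // CDF for |z|
    if (z < 0.0) n = 1.0 - n;   // use symmetry for negative z
    return n;
}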

Normality Testing

There are many algorithms for testing the normality of a data set.
Some of the tests include:

1. Kolmogorov-Smirnov test
2. Lilliefors test
3. Shapiro-Wilk W test

The Shapiro-Wilk test seems to be the best because of its good power properties, but the Kolmogorov-Smirnov test is easier to implement.
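To show how little code the Kolmogorov-Smirnov test needs, here is a rough Java sketch of the test statistic against a normal distribution with a given mean and standard deviation. The class and method names are my own, and the normal CDF uses the Abramowitz-Stegun polynomial approximation, so the statistic is only as accurate as that approximation. Note that if the mean and standard deviation are estimated from the same sample, the statistic should be compared against Lilliefors critical values instead of the standard Kolmogorov-Smirnov ones.

```java
import java.util.Arrays;

// Sketch only: names are hypothetical, and the CDF is an approximation.
class KsNormalitySketch {
    // Standard normal CDF (Abramowitz-Stegun polynomial approximation).
    static double phi(double z) {
        double t = 1.0 / (1.0 + 0.2316419 * Math.abs(z));
        double poly = ((((1.330274429 * t - 1.821255978) * t + 1.781477937) * t
                - 0.356563782) * t + 0.31938153) * t;
        double upperTail = 0.3989423 * Math.exp(-z * z / 2.0) * poly; // P(Z > |z|)
        return z >= 0 ? 1.0 - upperTail : upperTail;
    }

    // D = largest gap between the empirical CDF and the normal CDF.
    static double ksStatistic(double[] sample, double mean, double sd) {
        double[] x = sample.clone();
        Arrays.sort(x);
        int n = x.length;
        double d = 0.0;
        for (int i = 0; i < n; i++) {
            double f = phi((x[i] - mean) / sd);
            double gapAbove = (i + 1.0) / n - f; // empirical CDF just after x[i]
            double gapBelow = f - (double) i / n; // empirical CDF just before x[i]
            d = Math.max(d, Math.max(gapAbove, gapBelow));
        }
        return d;
    }
}
```

A small D means the sample follows the hypothesized normal distribution closely; the decision itself still requires a table of critical values for the sample size.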

Tuesday, November 22, 2005

NSF grant proposal: Organization

I found Expert Opinion's link useful and interesting.

Monday, November 21, 2005

CLASSPATH

I have worked only with C# for the past year. Today, I got a chance to work with my favorite programming language, Java. I always forget to set the CLASSPATH after I install the Java SDK. Setting it solved a lot of issues.
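A quick way to check what the JVM actually sees is to print the class path from inside a program. This is a small sketch with a made-up class name; it relies only on the standard `java.class.path` system property.

```java
import java.io.File;
import java.util.regex.Pattern;

// Sketch: prints each class path entry the running JVM was started with.
class ClasspathCheck {
    static String[] classpathEntries() {
        String cp = System.getProperty("java.class.path", "");
        if (cp.isEmpty()) {
            return new String[0];
        }
        // Entries are separated by ';' on Windows and ':' elsewhere.
        return cp.split(Pattern.quote(File.pathSeparator));
    }

    public static void main(String[] args) {
        for (String entry : classpathEntries()) {
            System.out.println(entry);
        }
    }
}
```

If a library is missing from this list, a ClassNotFoundException or NoClassDefFoundError at run time is the usual symptom.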

Wednesday, November 16, 2005

Types of output

Binary classification problem - A learning problem with binary outputs
Multi-class classification problem - A learning problem with a finite number of categories
Regression - A learning problem with real-valued outputs

Monday, November 14, 2005

Report

The report that I would recommend in a typical NN experiment can be divided into the following sections:

  1. Objective of the experiment
  2. Data description - Preprocessing of the data (if any)
  3. Architecture of the Neural Network used
  4. Pseudo code
  5. Results
  6. Discussion
  7. Conclusion
  8. Matlab code as an appendix

Sunday, November 13, 2005

Matlab script

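I used this MATLAB script, which relies on the Neural Network Toolbox. The important features were selected based on a threshold: a feature is flagged as important when removing it drops the test accuracy below the threshold.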



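% Train a feed-forward network on all 300 features and measure test accuracy.
clear all;
P = load('train1.txt');      % training inputs (one feature per row)
T = load('trainout1.txt');   % training targets
Ps = load('test1.txt');      % test inputs
Ts = load('testout1.txt');   % test targets
Net = newff(minmax(P), [300, 200, 150, 1], {'logsig', 'logsig', 'logsig', 'logsig'}, 'trainscg');
Net.trainParam.epochs = 1000;
Net.trainParam.goal = 0.0001;
[Net_t, TR] = train(Net, P, T);
y = sim(Net_t, Ps);
% figure
x = 1:37; % x-axis values for the optional plot (37 test samples)
% Fraction of test samples classified correctly after rounding the output.
efficiency = 1 - (sum(abs(round(y) - Ts))) / 37;
efficiency
% plot(x, y, 'bo--', x, Ts, 'r*-');
% legend('network','actual');

% Selection of Features
% Remove one feature at a time, retrain, and check the test accuracy.
Threshold = 0.90;
for i = 1:300
    disp 'Current Iteration'
    disp(i)
    FP = P;
    FPs = Ps;
    FP(i,:) = [];   % drop feature i from the training inputs
    FPs(i,:) = [];  % drop feature i from the test inputs
    Net = newff(minmax(FP), [299, 200, 150, 1], {'logsig', 'logsig', 'logsig', 'logsig'}, 'trainscg');
    Net.trainParam.epochs = 1000;
    Net.trainParam.goal = 0.0001;
    [Net_t, TR] = train(Net, FP, T);
    y = sim(Net_t, FPs);
    efficiency = 1 - (sum(abs(round(y) - Ts))) / 37;
    disp(efficiency);
    % Accuracy fell below the threshold without feature i, so it matters.
    if (efficiency < Threshold)
        disp(i)
        disp(' is an Important Feature');
    end
end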

Thursday, October 20, 2005

Red-Black Trees

I found an interesting website for object oriented programming and other computer science concepts. I added OOPWEB.com in the links section also.

Cormen et al., p. 269, has the pseudocode for inserting a node into a Red-Black Tree. The else part of the pseudocode has been left out there. I have made an attempt to write the else part as follows:

if p[x] = right[p[p[x]]]
    then y <- left[p[p[x]]]
        if color[y] = RED
            then color[p[x]] <- BLACK
                 color[y] <- BLACK
                 color[p[p[x]]] <- RED
                 x <- p[p[x]]
            else if x = left[p[x]]
                     then x <- p[x]
                          RIGHT-ROTATE(T, x)
                 color[p[x]] <- BLACK
                 color[p[p[x]]] <- RED
                 LEFT-ROTATE(T, p[p[x]])
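To check that the else part behaves, here is a compact Java sketch of the whole insertion, with both the textbook branch and the mirrored else branch. The class, field, and method names are my own, null references stand in for the black leaves, and `blackHeight` is a helper I added only to validate the result.

```java
// Sketch of red-black insertion in the style of the pseudocode; names are hypothetical.
class RedBlackTreeSketch {
    static final boolean RED = true, BLACK = false;

    static final class Node {
        int key;
        boolean color = RED; // new nodes start out red
        Node left, right, parent;
        Node(int key) { this.key = key; }
    }

    Node root;

    static boolean isRed(Node n) { return n != null && n.color == RED; } // null leaves are black

    void leftRotate(Node x) {
        Node y = x.right;
        x.right = y.left;
        if (y.left != null) y.left.parent = x;
        y.parent = x.parent;
        if (x.parent == null) root = y;
        else if (x == x.parent.left) x.parent.left = y;
        else x.parent.right = y;
        y.left = x;
        x.parent = y;
    }

    void rightRotate(Node x) {
        Node y = x.left;
        x.left = y.right;
        if (y.right != null) y.right.parent = x;
        y.parent = x.parent;
        if (x.parent == null) root = y;
        else if (x == x.parent.right) x.parent.right = y;
        else x.parent.left = y;
        y.right = x;
        x.parent = y;
    }

    void insert(int key) {
        // Ordinary binary-search-tree insertion of a red node.
        Node x = new Node(key), p = null, cur = root;
        while (cur != null) { p = cur; cur = key < cur.key ? cur.left : cur.right; }
        x.parent = p;
        if (p == null) root = x;
        else if (key < p.key) p.left = x;
        else p.right = x;

        // Fix-up: a red parent is never the root, so the grandparent exists.
        while (isRed(x.parent)) {
            Node g = x.parent.parent;
            if (x.parent == g.left) {
                Node y = g.right; // uncle
                if (isRed(y)) {
                    x.parent.color = BLACK; y.color = BLACK; g.color = RED; x = g;
                } else {
                    if (x == x.parent.right) { x = x.parent; leftRotate(x); }
                    x.parent.color = BLACK;
                    x.parent.parent.color = RED;
                    rightRotate(x.parent.parent);
                }
            } else { // the else part: same steps with left and right exchanged
                Node y = g.left; // uncle
                if (isRed(y)) {
                    x.parent.color = BLACK; y.color = BLACK; g.color = RED; x = g;
                } else {
                    if (x == x.parent.left) { x = x.parent; rightRotate(x); }
                    x.parent.color = BLACK;
                    x.parent.parent.color = RED;
                    leftRotate(x.parent.parent);
                }
            }
        }
        root.color = BLACK;
    }

    // Black-height of the subtree, or -1 if a red-black rule is violated.
    static int blackHeight(Node n) {
        if (n == null) return 1;
        if (isRed(n) && (isRed(n.left) || isRed(n.right))) return -1;
        int l = blackHeight(n.left), r = blackHeight(n.right);
        if (l < 0 || l != r) return -1;
        return l + (isRed(n) ? 0 : 1);
    }
}
```

Inserting keys in ascending order always attaches the new node on the right, so that input exercises the else branch on every fix-up.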


Thursday, September 08, 2005

Random numbers with normal distribution

It is easy to create them with Excel: Tools -> Data Analysis -> Random Number Generation (this requires the Analysis ToolPak add-in).
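The same thing can be done in code. Below is a minimal Java sketch using `java.util.Random.nextGaussian()`, which returns standard normal values that are then shifted and scaled. The class name, method name, and the fixed seed are assumptions made for the example.

```java
import java.util.Random;

// Sketch: draws n values from a normal distribution with the given mean and sd.
class NormalSamples {
    static double[] sample(int n, double mean, double sd, long seed) {
        Random rng = new Random(seed); // fixed seed makes the run repeatable
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            // nextGaussian() is N(0,1); scale by sd and shift by mean.
            out[i] = mean + sd * rng.nextGaussian();
        }
        return out;
    }
}
```

For example, `NormalSamples.sample(1000, 10.0, 2.0, 42L)` returns 1000 values centered around 10 with a standard deviation of about 2.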

Tuesday, September 06, 2005

What is model based design?

Sunday, August 21, 2005

Tuesday, August 09, 2005

Power Trading

I made a copy of chapter 11 from Investment Science. I have to read that chapter so that I can use the Wiener and Ito process models for generating prices.
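One common price model built on a Wiener process is geometric Brownian motion. Whether that is the exact model I will end up using is still open, so treat the Java sketch below as an illustration only; the class name, parameter names, and the fixed seed are assumptions.

```java
import java.util.Random;

// Sketch: simulates one price path under geometric Brownian motion.
class GbmPriceSketch {
    // Exact discretization: S(t+dt) = S(t) * exp((mu - sigma^2/2) dt + sigma sqrt(dt) Z)
    static double[] simulate(double s0, double mu, double sigma,
                             double dt, int steps, long seed) {
        Random rng = new Random(seed);
        double[] path = new double[steps + 1];
        path[0] = s0;
        double drift = (mu - 0.5 * sigma * sigma) * dt;
        double vol = sigma * Math.sqrt(dt);
        for (int i = 1; i <= steps; i++) {
            // Z is a standard normal draw, the increment of the Wiener process scaled by sqrt(dt).
            path[i] = path[i - 1] * Math.exp(drift + vol * rng.nextGaussian());
        }
        return path;
    }
}
```

Because each step multiplies by an exponential, simulated prices stay positive. That suits stock prices, but it may not suit electricity prices, which can spike or even go negative.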

Also, I have to incorporate type I, type II, and neural networks into the agents in the software.

Tuesday, August 02, 2005

Alchemi [.NET Grid Computing Framework]

This might be useful for me to test my program in a grid computing environment.

Alchemi [.NET Grid Computing Framework]: "can be behind firewalls and NAT servers.
Only idle CPU time of machines on the grid is used; user programs are not impacted."

Grid computing

One of my friends was asking about grid computing. I suggested that he do the following:

  1. Get some spare computers from GSA.
  2. Set up a grid.
  3. Install Globus toolkit.
  4. Run a set of examples and see what happens.

I asked him to set aside 3 hours every day to set up the basic infrastructure. I then suggested that he publicize his grid and encourage other people to run their applications on it. This would help both parties and would also validate and publicize his project. It could also motivate undergraduate and graduate students from different disciplines.

I hope he succeeds in his endeavors.

This is the email that I sent to him:


See this website: http://www.private-grid.nl/
Some group in Netherlands is providing a 10 machine
grid for free. They are offering people to sign up and run their applications
for free.

This is a very good way to validate your grid and
publicize. This will not only help you academically and politically, but
also boost your confidence when you talk about it. You can start
something similar to this and add manpower, thus enhancing your
managerial and administration skills. This will help you in any of
the long-term goals you might want to pursue.


Most schools have an open-source lab. Our
school lacks one. This might be one of the reasons why you should start one
in our school.


Good Luck!

IEEE Transactions

My paper got published yesterday on the webpage:

This paper appears in: Power Systems, IEEE Transactions on
Publication Date: Aug. 2005
Volume: 20, Issue: 3
On page(s): 1330-1340
ISSN: 0885-8950
Digital Object Identifier: 10.1109/TPWRS.2005.851948
Posted online: 2005-08-01 09:45:17.0

Tuesday, July 26, 2005

Today's Job

Before going to work, I had planned to write something about the data description of the high-frequency data and also do some experiments with the neural network.

I had to find the daily average price for all the 1826 points. I had to do this to find out whether there is any correlation between daily temperature and price. I also computed the first-order serial correlation coefficient for the hourly prices, using the "correl" function of Excel. More details to be added later.
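The two calculations can be sketched in Java as follows. This assumes the hourly prices sit in one array with exactly 24 values per day, and it computes the serial correlation the way Excel's CORREL does when given the series and the same series shifted by one step, i.e. as a plain Pearson correlation. The class and method names are made up for the example.

```java
import java.util.Arrays;

// Sketch: daily averages and first-order (lag-1) serial correlation.
class SerialCorrelationSketch {
    // Average each block of 24 hourly values into one daily value.
    static double[] dailyAverages(double[] hourly) {
        int days = hourly.length / 24;
        double[] daily = new double[days];
        for (int d = 0; d < days; d++) {
            double sum = 0.0;
            for (int h = 0; h < 24; h++) sum += hourly[d * 24 + h];
            daily[d] = sum / 24.0;
        }
        return daily;
    }

    // Pearson correlation coefficient of two equal-length series.
    static double correl(double[] a, double[] b) {
        int n = a.length;
        double meanA = 0.0, meanB = 0.0;
        for (int i = 0; i < n; i++) { meanA += a[i]; meanB += b[i]; }
        meanA /= n;
        meanB /= n;
        double sab = 0.0, saa = 0.0, sbb = 0.0;
        for (int i = 0; i < n; i++) {
            double da = a[i] - meanA, db = b[i] - meanB;
            sab += da * db;
            saa += da * da;
            sbb += db * db;
        }
        return sab / Math.sqrt(saa * sbb);
    }

    // Correlation between x[0..n-2] and x[1..n-1].
    static double lag1Correlation(double[] x) {
        double[] head = Arrays.copyOfRange(x, 0, x.length - 1);
        double[] tail = Arrays.copyOfRange(x, 1, x.length);
        return correl(head, tail);
    }
}
```

A value near 1 means each hour's price closely follows the previous hour's; the textbook autocorrelation estimator uses the overall mean for both halves and can differ slightly from this CORREL-style number.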

During the winter break of 2005 (Jan), I created a RBFF network using MATLAB. I created a file named ANN.m. The data used at that time was a simple price and load data set. Now, I have more dimensions in the high-frequency data set. So, I modified the old ANN.m file and created the ANNNew.m file. The new data has 43824 (1826 x 24) points. The neural network gives an out-of-memory error when run on the whole data set. So, I have to redo it using only part of the data set for training and testing.

Printed the tutorial on CodeDOM.
I have almost 5 years of data. I have to redo the tests like this: train with the data of the first year and then test with the data of the second year. Simply alternate the training and testing data.

Sunday, July 24, 2005

Genetic Programming

Currently I am reading the C# article on genetic programming. Also, I borrowed John Koza's Genetic Programming: On the Programming of Computers by Means of Natural Selection from the library. This is the same book that the article refers to.

Thursday, July 21, 2005

Genetic Algorithms

I was reading the book "An Introduction to Genetic Algorithms" by Melanie Mitchell from the online bookstore on the IEEE Computer Society website, and also the user guide of the GA toolbox, so that I can start implementing at least a simple genetic algorithm.

Tuesday, July 19, 2005

Open Source Software in C# and Library

http://csharp-source.net/

I went to the library to pick up the ILL book Tomorrow's Professor: Preparing for Academic Careers in Science and Engineering. Later, I went to the journals section and browsed through some of the journals. I found the following:


  1. The journal Applied Stochastic Models in Business & Industry has a special issue on Statistical Learning in March-April 2005 (Vol 21, No. 2) It is available electronically also.
  2. www.kansascityfed.org has some articles in their economic review section which are interesting to read. I skimmed through their article How long is a long term investment?.
  3. MSDN Magazines came up with really nice and interesting articles. Check out http://msdn.microsoft.com/msdnmag
  4. A couple of articles that I thought might be relevant to my project:
    Concurrency: What Every Dev Must Know About Multithreaded Apps, and Winsock: Get Closer to the Wire with High-Performance Sockets in .NET. I have printed these articles.

Friday, July 15, 2005

Pattern Matching

I finally figured out the algorithm for pattern matching of time-series data. It is similar to the one in the following paper:

F.-L. Chung, T.-C. Fu, V. Ng, R. W. P. Luk, An evolutionary approach to
pattern-based time series segmentation, IEEE Transactions on Evolutionary
Computation, 8 (5) (2004) 471-489.
I have written the pseudocode on paper and I am trying to implement it in C#. The paper does not discuss many implementation issues, but I agree with the methodology. It is sound and convincing.

Thursday, July 14, 2005

Oscar Wilde, Genetic Algorithms for Matlab

I like this quote by Oscar Wilde:

The true mystery of the world is the visible, not the invisible.

Regarding Genetic algorithms, read this first; then read this full length tutorial. Request a copy of the Genetic Algorithm Toolbox for Matlab from here. I got an email from them with zipped attachment. Decompressed the files into a folder "genetic" inside the "toolbox" folder of MATLAB. Added the path in the matlab path (MATLAB->FILE->SETPATH)

Wednesday, July 13, 2005

Data Processing

Finally, I am done with the processing of all raw data. I have managed to create a single file with the following fields: Date, Day of the Week, Temperature, Hour, Day-ahead Price, Real-time Price, and Load. At this point in time, since the daily temperature of Philadelphia was not available till 30 June 2005, I had to change the range of dates as 1 June 2000 to 31 May 2005.

I used the DayOfWeek property of the DateTime structure to extract the correct day of the week for a given date. Depending on the results, I am thinking of changing the Day of the Week field to a binary type that records whether the date is a weekday (Mon-Fri) or a weekend day (Sat, Sun).
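The weekday/weekend flag is simple to derive. My actual code uses DateTime.DayOfWeek in C#; the sketch below shows the same idea in Java with `java.time`, and the class and method names are hypothetical.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;

// Sketch: reduces the day of the week to a binary weekend flag.
class WeekendFlag {
    static boolean isWeekend(LocalDate date) {
        DayOfWeek day = date.getDayOfWeek();
        return day == DayOfWeek.SATURDAY || day == DayOfWeek.SUNDAY;
    }
}
```

For example, `WeekendFlag.isWeekend(LocalDate.of(2005, 7, 13))` is false because that date was a Wednesday. National holidays would need a separate lookup table.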

The data is organized on my desktop as: Raw, Pre-processed, and Processed data.

Tuesday, July 12, 2005

Data Analysis


Finally, using the C# program, I extracted the required data from the colossal data files on the PJM website. 2009 data points were chosen from 1 June 2000 to 30 June 2005. The summary of the data files is as below:
The numbers in each cell represent the column numbers in the csv file. The same format was followed for both RT price and DA price. The query string was changed from 'PJM' to 'PJM-RTO' after 1 May 2004.

Friday, July 08, 2005

To Do List

Data pre-processing is an important job.

  1. Download data from the website (www.pjm.com) into a folder on desktop.
  2. Using FileInfo and Directory classes in C#, process all the csv filenames in the folder.
  3. Using the filenames in the above step, parse each file and extract a record satisfying the query (zone). A single record consists of the date and hourly prices for that day.
  4. Get the temperature archive of a chosen city from University of Dayton website.
  5. Categorize the dates into weekend, weekday, and national holiday as much as possible.

Thursday, July 07, 2005

Pre-Processing of Data and Coding

Collected the data. I used an FTP client to download the data from PJM. All the data was in the form of zip files, so I used a batch zip extraction tool (Zipghost). Then, I downloaded the JHLib library and used it for parsing the csv files.

The following code was used to get the filenames of all the files from a folder:

using System;
using System.IO;

// Print the name of every file in the data folder, separated by commas.
string directory = @"D:\Data\Real Time Hourly Market Price";
DirectoryInfo directoryInfo = new DirectoryInfo(directory);
FileInfo[] fileInfo = directoryInfo.GetFiles("*.*");
foreach (FileInfo fi in fileInfo)
{
    Console.Write(fi.Name + ", ");
}

JHLib

JHLib - Jouni Heikniemi's .NET tool library

I used the JHLib library for parsing csv files. It is a very useful tool.

PJM data

I want to use hourly data, so I need load and price for the DA and RT markets. After a preliminary look at the data on the website (www.pjm.com), I have decided to consider only the PJM zone. At this time it appears that the PJM zone has been changed to the PJM-EAST zone.

DA hourly load

DA hourly market price
Available at ftp://www.pjm.com/pub/account/lmpda/index.html
From June 2000 to present

RT hourly load
Load data is available from 1998-2005

RT hourly market price
Available at ftp://www.pjm.com/pub/account/lmp/index.html
From Jan 2000 to present

Wednesday, July 06, 2005

Spatial Data Analysis

After reading the first few pages of "Spatial Data Analysis," I realized that this is not what I was looking for.
According to the author, Spatial means each item of data has a geographical reference so we know where each case occurs on a map.