Monday, October 6, 2008

Lanczos and Java

So the deadline is nearing. After a suggestion from Rafael I decided to look into replacing the current SVD method with the more efficient Lanczos method.

I quickly figured out that implementing this from scratch would be impossible for me within a two-week timeframe, especially since I would need to understand the algorithm in detail.

I started searching for things that would help:
  • Java matrix packages: there seem to be a lot of them out there. A popular one is Jama, which is in fact the matrix package used by Weka. Others include Matrix Toolkits for Java, Colt and JLAPACK. None of these had an implementation of SVD based on Lanczos.
  • However, there was code out there that did implement Lanczos, unfortunately not in Java. This includes SVDPACK and PROPACK. It seems that the scientific community still sticks to Fortran, C and Matlab for these matrix-based solutions.
  • I did look into calling a C or Matlab program from within Java, but it turned out to be more difficult than I thought.
  • The Matlab program would require a Matlab runtime environment on the host machine; in addition, it would make the entire system dependent on Matlab.
  • As for the C program, the code was difficult to read and wasn't structured very well.
  • For the time being it doesn't look like this Lanczos implementation can be done. It's definitely doable, but with only a few weeks left I need to wrap things up quickly so I can write the actual thesis.
With the remaining time I looked into implementing my proposed changes in the actual TML code. One thing I noticed was that operations other than the SVD also cost time, and I had not been measuring them. They added a significant amount of time, so much that generating the SVD only really became more expensive than reading a stored SVD at about the 4000-5000 word mark.
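
To avoid missing time spent outside the SVD again, each step could be timed separately. Below is a minimal sketch of that idea in plain Java. The class name `PhaseTimer`, the phase labels and the placeholder workloads are all made up for illustration; none of them come from the TML code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Accumulates wall-clock time per named phase (hypothetical helper, not part of TML). */
public class PhaseTimer {
    // LinkedHashMap keeps phases in the order they were first timed.
    private final Map<String, Long> nanosByPhase = new LinkedHashMap<>();

    /** Runs the task, adds its elapsed time to the given phase, and returns the task's result. */
    public <T> T time(String phase, Supplier<T> task) {
        long start = System.nanoTime();
        try {
            return task.get();
        } finally {
            // Repeated calls with the same phase name are summed.
            nanosByPhase.merge(phase, System.nanoTime() - start, Long::sum);
        }
    }

    /** Total nanoseconds recorded for a phase, or 0 if it was never timed. */
    public long nanos(String phase) {
        return nanosByPhase.getOrDefault(phase, 0L);
    }

    /** One line per phase, in milliseconds. */
    public String report() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : nanosByPhase.entrySet()) {
            sb.append(String.format("%-16s %10.3f ms%n", e.getKey(), e.getValue() / 1e6));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        PhaseTimer timer = new PhaseTimer();

        // Placeholder workloads standing in for the real steps (assumed names).
        double[][] matrix = timer.time("build matrix", () -> {
            double[][] m = new double[300][300];
            for (int i = 0; i < m.length; i++) {
                for (int j = 0; j < m[i].length; j++) {
                    m[i][j] = (i * 31 + j) % 7;
                }
            }
            return m;
        });

        double checksum = timer.time("decompose", () -> {
            double sum = 0.0;
            for (double[] row : matrix) {
                for (double v : row) {
                    sum += v * v;
                }
            }
            return sum;
        });

        System.out.println("checksum = " + checksum);
        System.out.print(timer.report());
    }
}
```

Wrapping every step this way, not only the SVD call, makes it easier to see where the crossover between generating and reading the SVD really lies. Single wall-clock runs on the JVM are noisy, so the numbers are only a rough guide unless each measurement is repeated a few times.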

I have a meeting tomorrow as usual and will discuss these issues.

1 comment:

akuhn said...

We ported LIBSVDC to Java, mail me if you need the source code (but I guess your deadline has long since passed :)