Planet Python/SoC - 2009 edition

October 15, 2009

Dale Peterson

New rolling torus code

I changed the code for the rolling torus example pretty significantly. There are little fixes to how the arrows representing the body-fixed unit vectors are drawn. Additionally, I added some code that calculates the kinetic, potential, and total energy of the system and plots them. The scipy solver seems to do a pretty good (not perfect) job of maintaining constant energy. I think tightening up the error tolerances would improve this, but the simulation is well behaved at this point so I didn’t bother.
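As a rough illustration of this kind of energy check, here is a minimal sketch. It is not the torus code: it assumes a simple unit-mass pendulum as a stand-in system and uses a hand-rolled fixed-step RK4 integrator instead of the scipy solver, so every name in it is hypothetical. The step size plays a role loosely analogous to the solver's error tolerances: shrinking it should shrink the drift in total energy.

```python
import math

G = 9.81      # gravitational acceleration, m/s^2
LENGTH = 1.0  # pendulum length, m (unit mass assumed)


def derivatives(state):
    """Right-hand side of the pendulum equations: state = (theta, omega)."""
    theta, omega = state
    return (omega, -(G / LENGTH) * math.sin(theta))


def rk4_step(state, dt):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    k1 = derivatives(state)
    k2 = derivatives(shifted(state, k1, dt / 2))
    k3 = derivatives(shifted(state, k2, dt / 2))
    k4 = derivatives(shifted(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def energies(state):
    """Return (kinetic, potential, total) energy per unit mass."""
    theta, omega = state
    kinetic = 0.5 * (LENGTH * omega) ** 2
    potential = G * LENGTH * (1 - math.cos(theta))
    return kinetic, potential, kinetic + potential


def max_relative_energy_drift(dt, t_end=10.0, state=(1.0, 0.0)):
    """Integrate to t_end and report the worst relative change in total energy."""
    e0 = energies(state)[2]
    worst = 0.0
    for _ in range(round(t_end / dt)):
        state = rk4_step(state, dt)
        worst = max(worst, abs(energies(state)[2] - e0) / e0)
    return worst


drift_coarse = max_relative_energy_drift(dt=0.05)
drift_fine = max_relative_energy_drift(dt=0.005)
print("coarse step drift:", drift_coarse)
print("fine step drift:  ", drift_fine)
```

Plotting the three energies against time, as the post describes, would make the same drift visible as a slow slope in the total-energy curve.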

October 15, 2009 09:32 PM

October 10, 2009

Tyler Laing

Professor Bends Matter To His Will, Not a Supervillain

Ed: This was published in the Okanagan Phoenix on Oct 7th, 2009
Last week, I interviewed a UBC-O Engineering professor, Dr. Kenneth Chau, about his recent research. Dr. Chau joined the School of Engineering in January, from the National Institute of Standards and Technology (NIST) in the USA. His particular specialty is nanotechnology, a field dealing with technology and objects at the nanometer (nm) scale. To put it in perspective, a nanometer is one billionth (1/1,000,000,000) of a meter, roughly 1/100,000th the width of a human hair, and the wavelength of visible light ranges from about 400 to 700 nm.

Dr. Chau recently made a significant advancement in the field of nanotechnology, where he demonstrated that light could actually pull a nano-scale object, rather than just push. The implications are very important, both for the field, and eventually for the production of military, scientific, and consumer products.

The kind of materials that Dr. Chau and others in his field work with, named ‘metamaterials’, offer many benefits to military, scientific, and consumer fields. For the military, such materials and devices can create new metal alloys with potentially unique properties, like extreme heat resistance or superior strength. On top of that, metamaterials offer the possibility of perfect lenses, or perfectly reflective mirrors. In the more futuristic list of possibilities, it is believed that this field of nanotechnology will eventually allow us to build an invisibility cloak, or even optical computers, but both such inventions are far from being created.

All materials can be characterized by a refractive index, or an index of refraction as it is also called. This is the degree to which light is slowed down within the medium. As well, when light crosses the boundary between two media with different refractive indices, it bends. Microscopes and lenses work by bending light in useful ways. All natural media have a positive refractive index, meaning that light is slowed down within the medium. However, some metamaterials have a property known as a negative refractive index, where light is bent in the opposite direction to that in materials with a positive refractive index.
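The bending described above follows Snell's law, n1 sin(θ1) = n2 sin(θ2). The short sketch below is only an illustration (the function name and the index values of 1.5 and -1.5 are made up for the example, not taken from Dr. Chau's work); it shows how a negative index flips the sign of the refraction angle, so the ray bends to the other side of the normal.

```python
import math


def refraction_angle_deg(incidence_deg, n1, n2):
    """Angle of the refracted ray from Snell's law: n1*sin(t1) = n2*sin(t2)."""
    sin_out = n1 * math.sin(math.radians(incidence_deg)) / n2
    if abs(sin_out) > 1:
        raise ValueError("total internal reflection: no refracted ray")
    return math.degrees(math.asin(sin_out))


# Light going from air (n = 1.0) into an ordinary glass-like medium.
ordinary = refraction_angle_deg(30.0, n1=1.0, n2=1.5)

# Same ray entering a hypothetical medium with a negative index.
negative = refraction_angle_deg(30.0, n1=1.0, n2=-1.5)

print(ordinary)  # positive angle: bent toward the normal, on the usual side
print(negative)  # same magnitude, opposite sign: bent to the other side
```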

This is where the radiation pressure of light comes into play. Light has momentum, like any object that is in motion. However, light has a very small momentum, and can only noticeably affect small particles. Radiation pressure is why a comet’s dust tail always points away from the sun; the dust particles are pushed away by the pressure of sunlight (the charged particles of the ion tail are swept back by the solar wind). Picture a fire hose being pointed at you. The pressure of the hose pushes you away.

When the radiation pressure of light is combined with a negative index of refraction, Dr. Chau proved, via experiment, that light can actually exert a pull force, in addition to pushing particles around. It is like finding some way of making the fire hose pull you, instead of pushing you away. If researchers can construct objects with arbitrary optical properties, then they can manipulate light in arbitrary ways, leading to all of the innovations listed previously, and many more not imagined yet. We have things, previously thought to be only in the realm of science fiction, happening in labs every day, around the world, and even on our campus.

Currently, Dr. Chau’s research group is working on several exciting projects, and requires talented and capable students with diverse backgrounds. One project Dr. Chau is involved in is the construction of a computer cluster, also known as a supercomputer, for simulating complex physical phenomena. Another is a project to build a sensor capable of detecting contaminants in water, using spectroscopic analysis of light that comes out of properly formed droplets of water. One possible use for this optical sensor is to detect vanishingly small amounts of impurities in water.

October 10, 2009 05:46 PM

The Microsoft Courier: Better than the font

This was published in the Okanagan Phoenix on October 7th, 2009

On Tuesday, September 22nd, Microsoft released details about a product now in “late development”, the Microsoft Courier. Microsoft calls it a “booklet”, rather than a “tablet”, due to the two touchscreens the Courier offers, with a bendable spine. The Courier also comes with a camera on the back of one half of the booklet, as well as a single Apple-like Home button in the spine, which is used for powering the device on and off as well.

The Courier will be a full-featured computer, running a specialized GUI meant specifically for the Courier. It will have wireless and Microsoft’s famed handwriting recognition; however, there is no indication of USB ports or other items. The killer feature, though, is the handwriting recognition, which in XP, Vista, and now Windows 7 is beyond fantastic.

There is a video circulating which shows a fantastic and well-thought-out GUI, which is focused on productivity and Getting Stuff Done, rather than the cool, slick, media-focused approach of the iPhone and iPod Touch. With the two screens, the handwriting recognition, and the integration with the OS, the Courier looks like an ideal tool for executives, creative professionals, and students.

Microsoft has not yet detailed what the specs of the Courier will be, and there is no word on battery life.

There is speculation that this is part of a business move to lower sales of and interest in Apple’s rumored iTablet. However, the promotional material and the design of the Courier are very distinctive, and seem to be trying to carve a new niche for Microsoft, in the world of the Professional.

The only downside to the Courier is that it is not out yet, and will not be before this editor graduates.

October 10, 2009 05:44 PM

Japan’s Love Affair with Droids

Ed: This was published in the Okanagan Phoenix Sept. 23, 2009
We’ve all seen robots and androids like Data from Star Trek, C3PO and R2D2 from Star Wars, as well as real-life robots, like ASIMO from Honda, and they all seem to be coming from Japan lately. There is a wide variety of reasons why robots and androids enjoy such popularity in Japan, ranging from the cultural to the purely economic.

Part of the reason why the Japanese are so fascinated by, and even encouraging of, robots and androids is that for the longest time, Japanese and other Asian fiction lacked a common trope that is often seen here in the West: that of the robots rising to crush their so-easily-crushed oppressors. Namely, us. It wasn’t until recently that this trope was seen in Japanese popular culture, and an excellent example is Casshern, a hyper-surreal action movie, where artificial life rises up and kills us all.

Some sociologists theorize that Japan lacks this common trope because industrialization was seen as largely positive for the country, particularly in the aftermath of WWII. In contrast, there was a large amount of social upheaval in western countries during their industrialization periods, where the machines were seen as a distinct threat.

On average, Japan has one of the oldest populations in the industrialized world. In just a few short years, many Japanese will reach the age of 65, and retire, making it so that 1 in every 4 Japanese will be over the age of 65. This is leading to a significant lack of employees, in turn leading to increasing wage costs for companies.

There are two major markets for robots and androids in Japan: one for the elderly, and one for replacing scarce employees where possible. The elderly market requires assistants and companions, especially for the elderly who have no family, or whose family doesn’t visit them. There are already a few preliminary models of elderly assistants, and they are selling very, very well. The other market requires robots to replace humans in easy-to-automate jobs, like store greeters, or waitresses in busy restaurants. One particular requirement is that they look and act as human as possible within the limited purview of the job.

The increasing use of robots and androids in Japan has already sparked several major concerns with their use. There are worries that people will begin to prefer talking to and dealing with robots and androids over their human brethren. As well, there are concerns about the material costs of robots and androids, especially maintenance costs, and whether it really would be more cost-effective than the humans that are being replaced. In addition, Japan has signed several environmental treaties, and increasing their high-tech usage will only hurt their compliance with these treaties.

There is also an issue from an economic viewpoint. The desire to replace humans with robots arises because humans are becoming more expensive to employ. But replacing workers with robots increases the number of people available for jobs, thus driving down the average wage and making it cheaper to hire people again. However, once hiring increases, people become more expensive to employ again. It is an inherently unstable situation, unless the government steps in to stabilize the see-saw effect.

One thing is for sure: Japan loves its robots, and it is currently the world leader in practical robotics.

October 10, 2009 05:42 PM

Quantum Superpositioning: Like Kids with ADHD, both excited and grounded.

Ed: This was published in the Okanagan Phoenix Sept. 23, 2009
Scientists at the Max-Planck-Institut für Quantenoptik (Max Planck Institute of Quantum Optics) in Germany are proposing a bold new experiment dealing with quantum superposition. What they propose to do is to place a virus into a superposition.

Superposition means that something is in two or more states at the same time. The classic illustration is known as the Schrödinger’s cat experiment. This is a thought experiment (meaning it has never been performed in real life, but only in the mind) where a cat is placed in a box, and a poison will be released only if a radioactive object inside the box decays within a given time period. The decay probability is 50%, so the cat has a 50% chance of being alive, and an equal chance of being dead. The trick, however, is that until observed, the cat is in both states, as in, the cat is both dead and alive. The cat is in a superposition, two states at once.

This is intuitively difficult for many people, including physicists. When the cat is observed, the cat becomes either dead or alive, but not both. This is called collapsing the wave function, where observation of some form (even by a sensor) causes the superposition to resolve to one state or another.

What the researchers want to do is to cool some matter, namely a virus, down to its quantum ground state in a vacuum, until there is no subatomic activity from the virus. Then they zap the virus with a special kind of laser, which causes the virus to be in both an excited state and a ground state at the same time: a superposition. Since this same experiment has already been performed with photons, electrons, and even whole molecules, they wish to see if they can make a much larger bit of matter reach a superposition. This will help show if such quantum mechanical effects apply on the macroscopic (large) scale, instead of just the microscopic scale.

The virus they use for the experiment needs to have several special qualities, and luckily (for our karmic revenge) the common flu virus fits this bill. In addition, the tobacco mosaic virus would also be perfect for the experiment.

October 10, 2009 05:40 PM

Dale Peterson

Gyrostat

I just added an example that derives the equations of motion of a gyrostat using PyDy.  It is located in the examples/gyrostat/ subdirectory.

In case you aren’t familiar, here is the definition of a gyrostat:

A gyrostat is a mechanical system which is comprised of more than one body and yet has the rigid body property that its inertia components are time independent constants.

The reason I became interested in gyrostats is that in modeling the bicycle, the rear frame and the rear wheel can be treated as a gyrostat, and in doing so, the number of parameters that appear in the equations of motion is reduced by two.  The same can be done for the front fork and handlebar assembly and front wheel.

In addition to the Python script that generates the equations, I took the time to do a complete write up of the model in LaTeX and generate a nice pdf of it.

October 10, 2009 04:20 AM

October 07, 2009

Siddhant Goel

siddhantgoel


I’ve always been a hopper. As in like, I’ve always kept on hopping between stuff. When I say stuff, I mean a Linux distribution, and things like that. I still haven’t bought a Macbook, so these tiny little annoying things are going to bother me for quite some time.

I cannot use Windows for any work that I intend to do. Even if I try to, I can’t. Setting things up is just a plain pain. Plus, the tools I need to work with almost never support Windows. So I cannot use it. I *need* a Linux distribution to work on. I started with Red Hat Linux, then moved to Fedora for quite some time. Then I moved to Ubuntu. Then I tried my hand at Debian. Then I gathered some guts and tried Gentoo (which I used for one full week, before I got pissed off by its everything-compiles nature). Then came Archlinux, and *that* put an end to all my distro hops. But that’s an altogether different story (maybe a different blog post on that).

But this post deals with the DE, rather than the distribution. I’ve always been a GNOME user, until a few months ago when I decided I should switch to something else. So I installed XFCE, and quite liked it. It was stable, fast, and gave me almost everything I could ask for in a DE. But then there were some problems with the xfce4-panel package; it kept on crashing on startup (and at times at random). I searched on forums, but didn’t find anything. I should have bugged some good folks at Archlinux/XFCE, but then decided I should try something else. So I went ahead and installed LXDE, which again, suited me fine. In fact, I’ve been using it as my main environment for quite some time now.

But now, I think I should go back to either GNOME or KDE. There are tiny little annoying things that bug me too often in lightweight DEs. I probably wouldn’t mind spending those extra machine resources on a full-blown DE, as long as I don’t have to tinker around to make things work. I’ve probably done enough of that. I need something usable. Now, until I have a job, I won’t buy a Macbook. So until then, it’s either GNOME or KDE. GNOME scares me. The last GNOME version I used was 2.26, and it took a horribly, really horribly long time to move from the login screen to the main desktop. I have some not-so-bad memories of using KDE from an internship. I’d probably give KDE a whirl this time. The latest KDE version looks awesome!

October 07, 2009 09:13 PM

September 29, 2009

Siddhant Goel

siddhantgoel


There is this thing about interviews. This weird thing. That I’m just a little too stumped in them. I know I know the answers. And I know the questions I have been asked aren’t difficult at all. They just need a concentrated effort for like 2 or 3 minutes, and you can come up with a solution. Of course that might not be an O(1) solution, but still coming up even with an O(n^2) algorithm should be decent for a start. You can always have a discussion with the interviewer (provided he is nice; both the interviews I’ve been through were with pretty nice interviewers), and then improve upon the solution. And the whole activity is fun. Optimizing stuff, removing unnecessary executions, cleaning up the code, is really a fun activity.

So interviews are basically fun. Until you put me in the room, that is. Maybe the reason for this whole thing is that I’ve given just 2 interviews, but I’m usually a little nervous, not being able to think when someone is watching each and every step of mine. Both the interviews were with the best companies in India, and just thinking that this could actually screw my chances of working with some of the best minds, irks me.

For instance, the interviewer asked me to reverse a linked list. Now I’ve done this question like a hundred times before, and I know the solution, which is nothing more than trivial. But, at that time, all I did was get lost arranging the pointers. The function to reverse a linked list is only a handful of statements (a recursive one), but for some strange reason, I couldn’t perform it at that time. Sad.

Just for my own sake, I’ll write down the solution here. I hope the solution is correct.

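#include <stddef.h>  /* for NULL */

/* Reverse the list recursively and return the new head. */
struct node *reverse (struct node *start) {
    if (start == NULL || start->next == NULL) return start;
    struct node *head = reverse(start->next);  /* reverse the rest first */
    start->next->next = start;                 /* old successor now points back at us */
    start->next = NULL;                        /* we are the new tail */
    return head;
}

Only a handful of statements in the entire function (I’m assuming it does the job). Nothing more than trivial. But there is this really weird thing about interviews.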
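For comparison, here is a minimal iterative sketch of the same pointer shuffle in Python. The `Node` class and the helper names are illustrative only and not from the post; the point is that each node's `next` link is flipped to point at the node before it, and the last node visited becomes the new head.

```python
class Node:
    """A singly linked list node (hypothetical helper for this sketch)."""

    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def reverse(head):
    """Reverse the list in place and return the new head."""
    previous = None
    while head is not None:
        following = head.next   # remember the rest of the list
        head.next = previous    # flip this node's link backwards
        previous = head         # this node is now the front of the reversed part
        head = following        # move on to the remembered rest
    return previous


def from_list(values):
    """Build a linked list from a Python list, preserving order."""
    head = None
    for value in reversed(values):
        head = Node(value, head)
    return head


def to_list(head):
    """Collect the node values into a Python list."""
    values = []
    while head is not None:
        values.append(head.value)
        head = head.next
    return values


print(to_list(reverse(from_list([1, 2, 3]))))
```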

Maybe it’s just the fact that I’ve given only 2 interviews. I need to bring my brain to the work-while-someone-is-watching-you mode. Let’s see how long it takes. I hope I don’t screw up any more interviews.

September 29, 2009 05:07 PM

September 24, 2009

Werner Laurensse

GSOC meeting @ HSB

The coming Tuesday (29.09.09) there will be a meeting of all the GSOC’ers in the Benelux. This meeting will be at the HSB in Brussels during TechTue25 (more info at HSB). There will be a lot of other people from outside of GSOC, so you are always welcome if you would like to connect with technically minded people and share ideas.

PS: Bring your GSOC T-shirt if it arrives on time!

September 24, 2009 10:56 PM

September 12, 2009

Tyler Laing

Alan Turing: One badass dude

(This was originally published in the UBC-O Phoenix student newspaper on Thursday September 10th, 2009, written by me)

There is a debate raging in the halls of Britain’s government, across the internet, and in many people’s minds. Right now, there is a growing movement for the British government to apologize to Alan Turing, posthumously, for its treatment of him, which was atrocious by modern-day sensibilities.

Alan Turing is regarded by many to be the father of modern computing. His work formalized the concept of the algorithm (a way of solving a problem, like how to do long division without a calculator) and many other things. One of the most important facets of his work was on the “halting problem”, which asks if there is an algorithm that can determine if another algorithm will ever complete, given a set of inputs. He was able to show, in a fairly long and involved proof, that there is no such algorithm that will work with every algorithm and set of inputs. He also developed the notion of a programmable machine (the Turing machine), a universal version of which is able to emulate any other Turing machine.
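The heart of the halting-problem argument can be sketched in a few lines. This is only an illustration of the contradiction, not Turing's actual proof, and all the names are invented for the example: given any candidate `halts(program, argument)` predictor, one can build a program that does the opposite of whatever the predictor says about it.

```python
def make_paradox(halts):
    """Build a function that defies whatever `halts` predicts about it."""
    def paradox(program):
        if halts(program, program):
            while True:      # predicted to halt, so loop forever instead
                pass
        return "halted"      # predicted to loop forever, so halt at once
    return paradox


def claims_nothing_halts(program, argument):
    """A (wrong) candidate predictor that always answers 'does not halt'."""
    return False


paradox = make_paradox(claims_nothing_halts)

# The predictor says paradox(paradox) never finishes...
prediction = claims_nothing_halts(paradox, paradox)
# ...yet it finishes immediately, so the predictor is wrong on this input.
result = paradox(paradox)
print(prediction, result)
```

A predictor that answered `True` here would be wrong the other way round, because `paradox(paradox)` would then loop forever (so that case is not run). Whatever the candidate answers, it is wrong about the program built from it.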

All of modern computing, from Facebook to Halo, owes Turing a debt of gratitude for his work and efforts in designing computers, formalizing much of the important mathematics underlying modern computers, and demonstrating their use during WWII. For a time, Turing was responsible for the unit (known as Hut 8) deciphering Nazi naval communications, and was responsible for saving many lives during WWII. He also designed the bombe, which was the key tool for breaking Enigma, the Nazi cipher. And he did all this before the age of 42.

Because Alan Turing died, of suicide, at the age of 41, just short of his 42nd birthday. Two years previous, in 1952, Turing was outed as a homosexual and, under the laws of Britain at the time, tried and found guilty of “gross indecency”. The punishment: castration. Turing was convicted under the same law that Oscar Wilde suffered under, only Turing chose to be chemically castrated, rather than go to jail.

These laws have since been repealed in Britain, and are now against the UN Charter of Human Rights, and against the EU constitution. What is happening right now is that people want the government of Britain to acknowledge wrongdoing in its actions, not just against Turing, but against anyone who suffered under those laws, many of whom are still alive today. Over 6000 signatures had already been added to the petition regarding the apology at the time this article went to press.

Editor’s note: The British Government, on the date of publication, offered an apology to Alan Turing and the thousands of other gay men wrongly punished under this law.

September 12, 2009 10:38 PM

E-textbooks vs Textbooks

(This was published in the UBC-Okanagan Phoenix student newspaper on Thursday, September 10th)
When school starts, during the initial rush at the bookstore, some people may notice a new form of textbook. This is the “ebook”, or electronic book. The ebooks that we will see in the UBC-O bookstore are offered by a company called Coursesmart (http://www.coursesmart.com), which specializes in offering ebooks for textbooks. They’ve been around since 2007, and have over 6000 textbooks available.

E-textbooks provide several advantages over paper textbooks. For example, you do not have to carry around a 500 page, 3 pound textbook, in addition to a laptop and other matériel. Finally, our backs can breathe easy. In addition, the e-textbook saves on gasoline, since it doesn’t need to be shipped. As well, the e-textbook can save you significant money, always important to starving students, with the e-textbooks selling for half the price of the paper textbook. Coursesmart provides extensive tools to make their e-textbooks that much more useful, like search, go to page X, notes you can add to any point in the book, highlighting any section you wish, or undoing the highlighting, copy, paste, and even printing pages out on demand. These are certainly powerful tools, that many students have wished they had with the paper textbooks.

However, there are certain limitations to the system. You can only download your e-textbook to one computer, and there is little indication of the process for getting another book through Coursesmart if your computer hardware fries. On the other hand, you can choose to access the e-textbook through Coursesmart’s website, and have it available on any computer, as long as you have a supported browser. In addition, the e-textbook will only be available for 180 days, which is certainly long enough to outlast your classes.

To obtain and use e-textbooks, you must first check if the professor for your course is okay with students using e-textbooks. If so, the book for your course will also have a second tag, listing an e-textbook for purchase. Then you inform the cashier you are buying the e-textbook, and follow the cashier’s instructions. You will receive a receipt with a code on it, and a URL to visit. This URL, at press-time, was http://www.coursesmart.com/redemption?coupon= where you place the code after the equals sign.

Follow the instructions at Coursesmart, and choose whether to download the e-textbook, or access it online through Coursesmart’s website. You cannot choose both. Downloading the e-textbook will require the download and installation of Coursesmart’s Bookshelf software. Both downloading and accessing the book through Coursesmart’s website offer the exact same functionality.

The other process is to sign up with Coursesmart directly, find the book for your course, and buy it through them.

There is also a limitation on printing with Coursesmart’s e-textbooks. You can only print ten (10) pages at any one time, up to a maximum of 150% of the total pages in the textbook. So if the textbook has 200 pages, you can print a total of 300 pages. Coursesmart acknowledges there may be a bug that sometimes occurs with printing, where a user runs out of their allowed pages, at which point the user can contact Coursesmart’s customer service to enable more pages to print.

Coursesmart’s e-textbooks provide a cheap, planet-conscious, weightless alternative to expensive, dead-tree, and heavy textbooks. This, obviously, comes with some restrictions, which depending on the person can be reasonable, or completely unreasonable. It is up to the students to decide what is the best option for them. It can certainly be useful to buy an e-textbook for a course that isn’t in one’s major, but is still required by the university.

September 12, 2009 10:36 PM

An economic look at DRM

This morning, I saw a Slashdot article, this one, where an indie game developer mentions the free-rider problem in regards to DRM. The problem is, there is a significant misunderstanding of the free-rider problem and how DRM deals with it (hint: it doesn’t).

The free-rider problem is an issue in economics and game theory, where someone is able to get a free ride off of the effort of others. One example is where you have two trappers. Now, if both trappers work hard on their own traps, they’ll each come out ahead at about three resources each. However, if one trapper decides to poach the other trapper’s lines, then that trapper gets six resources. If they both poach, neither gets anything. Typical self-interest says that it is better to poach than to trap, as there is a possibility of getting more resources for less effort. However, in terms of the collective interests of both trappers, it’s better if neither poaches, because there is a very real risk that if both trappers poach, then neither gets anything, and this is an extremely negative outcome. Usually, people spend the effort or the cost to get the benefits because of the risks associated with free-riding (like jail time for theft and fraud).
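The trapper example can be written down as a small payoff table and checked mechanically. The sketch below is only an illustration: it assumes that a trapper whose lines are poached ends up with nothing (the text only says the poacher gets six), and the function and variable names are made up. It lists the outcomes where neither trapper can do strictly better by changing their own choice alone, i.e. the pure-strategy Nash equilibria.

```python
from itertools import product

ACTIONS = ("trap", "poach")

# PAYOFFS[(a, b)] = (resources for trapper A, resources for trapper B)
PAYOFFS = {
    ("trap", "trap"): (3, 3),
    ("poach", "trap"): (6, 0),   # assumption: the poached trapper gets nothing
    ("trap", "poach"): (0, 6),
    ("poach", "poach"): (0, 0),
}


def pure_nash_equilibria(payoffs, actions):
    """Outcomes where no single player gains by switching their own action."""
    equilibria = []
    for a, b in product(actions, repeat=2):
        pay_a, pay_b = payoffs[(a, b)]
        a_can_improve = any(payoffs[(alt, b)][0] > pay_a for alt in actions)
        b_can_improve = any(payoffs[(a, alt)][1] > pay_b for alt in actions)
        if not a_can_improve and not b_can_improve:
            equilibria.append((a, b))
    return equilibria


equilibria = pure_nash_equilibria(PAYOFFS, ACTIONS)
print(equilibria)
```

Under these assumed numbers, the cooperative outcome (both trap) is not among the equilibria, since either trapper could switch to poaching and get six instead of three. That is the free-rider temptation in miniature.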

So in terms of piracy, if everyone free-rides, then everyone loses in the end. This is fairly understandable to all pirates, regardless of reason. However, free-riders exist in any system where there is a possibility of free-riding. Theft will always exist. Fraud will always exist. The incentives and motivations are too great. The question is how much the free-riders cost you, and whether everyone who pirates really is a free-rider.

One of the central fallacies used by DRM proponents is that every act of piracy is a lost sale. This is, frankly, wrong. It’s a very complex situation, but it can be broken down. Let’s consider a customer and an artist. The artist produces a work that is either good or bad. The artist only profits when someone buys a work. A customer, however, may have money, or may not have money now (a student, for example). So if a customer has money, and they choose to buy, there are basically two outcomes: where the work is good, and both benefit, or where the work is bad, and only the artist benefits. It can be hard to determine if something is worth buying these days, with such varied tastes and such varied levels of quality. Now, imagine if the customer with money pirates instead. There are three outcomes here. In the first, the work is good, and the customer will buy it or something else from the artist, and thus both profit. In the second, the work is bad, and the customer will not buy anything, and thus saves their money. And finally, the customer doesn’t care, and doesn’t buy anything from the artist either way. This is where a lost sale happens: not when the customer pirates, but when the quality of the work doesn’t matter and they will still pirate it.

There is also the alternate side, where a customer doesn’t have money now. This is what artists should be concerned about. Let’s say the poor customer doesn’t pirate. They have no money, so they can’t do a lot of social activities, and so they basically end up bored. There are of course libraries and the like, so we’ve changed the behavior of our customer to that of someone who contributes nothing negative or positive to the situation. However, if the poor customer pirates instead, there are four possible outcomes. The first is where the work is good, and the poor customer saves up or earns the money to buy from the artist. Both profit in that instance. Or, the poor customer spreads the word about the product, and gets others, with money, to buy it. This is potentially a situation where the artist can gather many new customers, although the exact benefit of word of mouth is hard to quantify. Then there is the instance where the poor customer discovers the product is bad. The poor customer profits, in that they don’t waste their own or others’ money on a product not worth the money. And finally, there is the instance where it doesn’t matter: the poor customer will pirate anyway, and there will be no profit for the artist.

This can all be summed up by a simple picture:

Nash Equilibrium of Piracy

Basically, the artist fails to understand the motivations of the customer, namely that they don’t want to waste money on crappy works. So it’s to the customer’s benefit to pirate, even if they have the money, as the risks are minimized for them. For the artist, it’s to their benefit to produce something good, and worthy of the money. So unfortunately, what DRM does is mess up this equilibrium, in which customers end up spending money on products they want to buy, and the artists have a good incentive to produce good works. DRM forces customers to waste money, and the artist has spent a significant amount of money doing this. The costs outweigh the benefits, which, to be frank, were dubious in the first place. The artist spends money and effort on a DRM system which is easily circumvented, as long as someone finds it worthwhile to do so.

Circumvention is done for a variety of reasons, ranging from the invasiveness of the DRM, to just the technical challenge of doing so. There is also a possibility that the DRM makes the product worse, and the pirates wish to work around this. By placing DRM in the way, the artist creates an incentive to break it. So either make sure the DRM doesn’t get in the way and works perfectly, or don’t use DRM.

Essentially, the way to deal with piracy isn’t DRM, but making good stuff. If you make good stuff, and don’t place technical barriers in the way, you create a strong incentive for customers to pay for your work. Even better, instead of letting people evaluate the works via piracy, provide representative samples that are constantly changed, 100% free, and good quality. An example would be providing two songs from an album, along with short samples of the other songs. This helps the customer see that it is worth spending the money on the album, especially if buying each song individually costs more than buying the album. Or, just sell the songs individually.

September 12, 2009 02:59 PM

September 11, 2009

M. Shuaib Khan

GSoC ends, but work goes on

Well, GSoC came to an end. Thanks to my supportive mentor, Seth Lemons, for grading my work as satisfactorily meeting the initial requirements, and thus making my GSoC work a success.


But the work doesn't come to an end here. I plan on keeping this blog alive, though I got really busy for the past couple of weeks, with college coming to an end, project thesis presentations, shifting to a new place, setting up the new place... it was pretty much a mess.

Now, I have plans to work on my fork of figleaf. I made the C and Python report integration work properly, but the tool needs to be made user friendly. I still haven't submitted the patches for the test suite of Py3k that I wrote; I need to get feedback from devs on that, and write more tests in the meanwhile.

GSoC was a great gateway into the world of core Python development, and I plan to make good use of it, for the long term.

Keep expecting updates.

September 11, 2009 01:47 PM

September 09, 2009

Werner Laurensse

Belgium media and police still not entirely sure just how the internetz work...

I was quite shocked this morning when I read on deredactie.be about a terror alert aimed at a local school in the Belgian city “Mechelen”. A total of 333 police agents were dispatched, 9 arrests were made and there was even a helicopter circling the airspace in search of possible killers.

Only later that day did I learn that the source of all this commotion was a message posted on 4chan.org saying: “293 days to go until I strike at KTA Lyceum Mechelen, watch the news”.

That’s correct, 4chan, the internet community famous for several internet memes like Rick’Roll, Chocolate Rain, the Sarah Palin email hack, spamming YouTube with porn, claiming Steve Jobs died of a heart attack, spamming the swastika symbol into Google Hot Trends, the Project Chanology, the pedo bear, threatening to bomb several football stadiums with dirty bombs, posting pictures of mock pipe bombs, and several other ‘joke’ threads.

Well, on this 4chan, someone threatens to harm a school almost 9 months after posting the actual message, on a site that could be considered the social dumpster of the internet. The only thing this overreaction can ensure is more jokes from 4chan….

September 09, 2009 03:32 PM

September 07, 2009

Aaron Meurer

asmeurer


Sorry about the extreme delay with this. I of course have been busy with classes.

Note that this will just be a summary of the summer, with my comments looking back on it. If you want more details on each individual thing that I implemented, look back on my previous blog posts.

Let me start from the beginning. Around late February to early March of this year, I discovered the existence of Google Summer of Code. I knew that I wanted to do some kind of work this summer, preferably an internship, so it piqued my interest. At that time, the mentoring organizations were still applying for GSoC 2009, so I could only look at the ones from 2008. Most of them were either Linux things or Web things, neither of which I had any experience in or am I much interested in. I took a free course in Python at my University the previous semester, and it was the programming language that I knew best at the time. I had learned some Java in my first semester CS class (did I mention that this was my first year at college?), and I hated it, and I was still learning C for my second semester CS class. So I looked at what the Python Foundation had to offer. I am a double major in math and computer science, so I looked under the math/science heading. That’s when I saw SymPy.

I should note that I have been ahead in Math. It was my second semester, and I was taking Discrete Mathematics, Ordinary Differential Equations, Basic Concepts of Math, and Vector Analysis. So I looked for project ideas on the SymPy page that related to what I knew. The only one that I saw, other than core improvements, was to improve the ODE solving capabilities. I got into contact with the community and looked at the source, finding that it was only capable of solving 1st order linear equations and some special cases of 2nd order linear homogeneous equations with constant coefficients. I already at that point knew several methods from my ODE course, and I knew much of what I would learn.

Application Period
The most difficult part of the Google Summer of Code, in my opinion, is the application period. For starters, you have to do it while you are still in classes, so you pretty much have to do it in your free time. Also, if you have never applied for Google Summer of Code before, you do not really know what a good application should look like. I have long had my application available on the SymPy Wiki, and I will reference it here a few times. First off, it was recommended to me by some of the SymPy developers that I put as many potential things that I could do in the summer in my application as I thought I could do. I was still only about half way through my ODEs course when I wrote the application, but I had the syllabus so I knew the methods I would be learning at least by name. So that is exactly what I did: I packed my application with every possible thing that I knew we would be learning about in ODEs.

After I felt that I had a strong application, and Ondrej had proofread it for me, I submitted it. There were actually two identical applications, one for the Python Software Foundation, and one for Portland State University. This is because SymPy was not accepted as a mentoring organization directly, so they had to use those two foundations as proxies. A requirement of acceptance is to submit a patch that passes review. I decided to add a Bernoulli solver, because Bernoulli can be solved in the general case much like the 1st order linear ODE, which was already implemented.

After I applied, there was an acceptance period. I used that period to become acquainted with the SymPy community and code base. A good way to do this is to try to fix EasyToFix issues. I found issue 694, which is to implement a bunch of tests from a paper by Michael Wester for testing computer algebra systems. The tests cover every possible thing that a full featured CAS could do, so it was a great way to learn SymPy. The issue is still unfinished, so working on it is still a good way to learn how to use SymPy.

Also, it was important to learn git, SymPy’s version control system. The learning curve is pretty steep if you have never used a version control system before, but once you can use it, it becomes a great tool at your disposal.

Acceptance
After being accepted, I toned down my work with SymPy to work on finishing up my classes. My classes finished a few weeks before the official start, so I used that period to get a jump start on my project.

The GSoC Period
For the start of the period, I followed my timeline. I implemented 1st order ODEs with homogeneous coefficients and 1st order exact ODEs. These were both pretty simple to do, as I expected.

The next thing I wanted to do was separable. My goal at this point was to get every relevant exercise from my textbook to work with my solvers. One of the exercises from my book (Pg. 56, No. 21) was dy=e^{x + y}dx. I soon discovered that it was impossible for SymPy to separate e^{x + y} \rightarrow e^{x}e^{y}, because the second would be automatically combined in the core. I also discovered that expand(), which should have been the function to split that, expanded using all possible methods indiscriminately. Part of my separatevars() function that I was writing to separate variables in expressions would be to split things like x + yx \rightarrow x(1 + y) and 2 x y + x^{2} + y^{2} \rightarrow (x + y)^{2}, but expand() as it was currently written would expand those.

So I spent a few weeks hacking on the core to make it not auto-combine exponents. I came up with a rule that exponents would only auto-combine if they had the same term minus the coefficient, the same rule that Add uses to determine what terms should be auto-combined by addition. So it would combine e^{x}e^{x} \rightarrow e^{2x}, but e^{x}e^{y} would be left alone. It turns out that some of our algorithms, namely the Gruntz limit algorithm, rely on auto-combining. We already had a function that could combine exponents, powsimp(), but it also combined bases, as in x^zy^z \rightarrow (xy)^z, so I had to split the behavior so that it could act only as auto-combining once did (by the way, use powsimp(expr, combine='exp', deep=True) to do this). Then, after some help from Ondrej on pinpointing the exact location of the bugs, I just applied the function there. The last thing I did here was to split the behavior of expand, so that you could do expand(x*(y + 1), mul=False) and it would leave it alone, but expand(exp(x + y), mul=False) would return exp(x)*exp(y). This split behavior turned out to be useful in more than one place in my code.
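To make the split behavior concrete, here is a small sketch of the calls mentioned above. It assumes a recent SymPy release, where the `mul` hint of `expand()` and the `combine` option of `powsimp()` are available; the variable names are only for illustration.

```python
from sympy import symbols, exp, expand, powsimp

x, y = symbols("x y")

# Split the exponential but leave the product x*(y + 1) unexpanded.
split_exp = expand(exp(x + y), mul=False)
kept_product = expand(x * (y + 1), mul=False)

# A plain expand() applies all of its default hints, so the product is multiplied out.
full = expand(x * (y + 1))

# powsimp with combine='exp' merges exponents again without merging bases.
recombined = powsimp(exp(x) * exp(y), combine="exp", deep=True)

print(split_exp, kept_product, full, recombined)
```

The point of the `mul=False` hint is that a caller such as a variable-separation routine can ask for exactly one kind of expansion and keep any factored form it has already found.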

This was the first non bug fix patch of mine that was pushed in to SymPy, and at the time of this writing, it is the last major one in the latest stable version. It took some major rebasing to get my convoluted commit history ready for submission, and it was during this phase that git finally clicked for me, especially the git rebase command. This work took a few weeks from my ODEs time, and it became clear that I would not be doing every possible thing from my application. The reason that I included so much in my application was that my project was non-atomic. I could implement a little or a lot and still have a working useful module.

If you look at my timeline on my application, you can see that the first half is symbolic methods, and the second half is other methods, things like series. It turns out that we didn’t really learn much about systems of ODEs in my course and we learned very little about numerical methods (and it would take much more to know how to implement them). We did learn series methods, but they were so annoying to do that I came to hate them with a passion. So I decided to just focus on symbolic methods, which were my favorite anyway. My goal was to implement as many as I could.

After I finished up separable equations, I came up with an idea that I did not have during the application period. dsolve() was becoming cluttered fast with all of my solution methods. The way that it worked was that it took an ODE and it tried to match methods one by one until it found one that worked, which it then used. This had some drawbacks. First, as I mentioned, the code was very cluttered. Second, the ODEs methods would have to be applied in a predetermined order. There are several ODEs that match more than one method. For example, 2xy + (x^2 + y^2)\frac{dy}{dx}=0 has coefficients that are both homogeneous of order 2, and is also exact, so it can be solved by either method. The two solvers return differently formatted solutions for each one. A simpler example is that 1st order ODEs with homogeneous coefficients can be solved in two different ways. My working solution was to try them both and then apply some heuristics to return the simplest one. But sometimes, one way would create an impossible integral that would hang the integration engine. And it made debugging the two solvers more difficult because I had to override my heuristic. This also touches on the third point. Sometimes the solution to an ODE can only be represented in the form of an unevaluatable integral. SymPy’s integrate() function is supposed to return an unevaluated Integral class if it cannot do it, but all too often it will just hang forever.

The solution I came up with was to rewrite dsolve using a hints method. I would create a new function called classify_ode() that would do all of the ODE classification, removing it from the solving code. By default, dsolve would still use a predetermined order of matching methods. But you could override it by passing a “hint” to dsolve for any matching method, and it would apply that method. There would also be options to only return unevaluated integrals when applicable.
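As a rough illustration of the hints idea, the sketch below classifies the ODE 2xy + (x^2 + y^2)dy/dx = 0 from the previous paragraph and then asks dsolve() for one particular method. It assumes a recent SymPy release; the hint names are the ones current SymPy reports and may differ from the names used in 2009.

```python
from sympy import Function, Derivative, Eq, symbols, classify_ode, dsolve

x = symbols("x")
y = Function("y")

# The ODE from the text: both exact and with homogeneous coefficients.
ode = Eq(2 * x * y(x) + (x**2 + y(x) ** 2) * Derivative(y(x), x), 0)

# classify_ode() returns a tuple of every hint that matches, in default order.
hints = classify_ode(ode, y(x))
print(hints)

# Ask dsolve() for one specific method instead of its default choice.
# simplify=False skips the post-processing step that tries to solve for y(x).
exact_solution = dsolve(ode, y(x), hint="1st_exact", simplify=False)
print(exact_solution)
```

Keeping classification separate from solving means a single solver can be exercised in isolation, which is what makes debugging one method at a time practical.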

I ended up doing this and more (see the docstrings for classify_ode() and dsolve() in the current git master branch), but before I could I needed to clean up some things. I needed to rewrite all of dsolve() and related functions. Before I started the program, there were some special cases in dsolve for second order linear homogeneous ODEs with constant coefficients and one very special case ODE for the expanded form of \frac{d^2}{dx^2}(xe^{-y}) = 0.

So the first thing I did was implement a solver for the general homogeneous linear with constant coefficients case. These are rather simple to do: you just find the roots of the characteristic polynomial built off of the coefficients, and then put the real parts of the roots in front of the argument of an exponential and the imaginary parts in front of the arguments of a sine and cosine (for example, 3 \pm 2i would give C1e^{3x}\sin{2x} + C2e^{3x}\cos{2x}). The thing was, that if the imaginary part is 0, then you only have 1 arbitrary constant on the exponential, but if it is non-zero, you get 2, one for each trig function. The rest falls out nicely if you plug 0 in for b into e^{ax}(C1\sin{bx} + C2\cos{bx}) because the sine goes to 0 and the cosine becomes 1. But you would end up with C1 + C2 instead of just C1 in that case. I had already planned on doing arbitrary constant simplification as part of my project, so I figured I would put this on hold and do that first. Then, once that was done, the homogeneous case would be reduced to 1 case instead of the usual 2 or 3.
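The 3 \pm 2i example can be checked numerically with nothing but the standard library. This is only a sketch for the second-order case; the helper names characteristic_roots and residual are made up for this illustration and are not SymPy functions.

```python
import cmath
import math


def characteristic_roots(a, b, c):
    """Roots of a*r**2 + b*r + c = 0, the characteristic polynomial of a*y'' + b*y' + c*y = 0."""
    disc = cmath.sqrt(b * b - 4 * a * c)
    return (-b + disc) / (2 * a), (-b - disc) / (2 * a)


def residual(a, b, c, alpha, beta, x):
    """Plug y = exp(alpha*x)*sin(beta*x) into a*y'' + b*y' + c*y using hand-computed derivatives."""
    s, co, e = math.sin(beta * x), math.cos(beta * x), math.exp(alpha * x)
    y = e * s
    dy = e * (alpha * s + beta * co)
    d2y = e * ((alpha**2 - beta**2) * s + 2 * alpha * beta * co)
    return a * d2y + b * dy + c * y


# y'' - 6*y' + 13*y = 0 has characteristic roots 3 +/- 2i.
r1, r2 = characteristic_roots(1, -6, 13)
alpha, beta = r1.real, abs(r1.imag)

# The real part goes into the exponential, the imaginary part into the sine.
worst = max(abs(residual(1, -6, 13, alpha, beta, x)) for x in (0.0, 0.5, 1.0))
print(r1, r2, worst)
```

The residual stays at floating point noise, which is the numerical counterpart of saying that e^{3x}\sin{2x} solves the equation.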

My original plan was to make an arbitrary constant type that automatically simplified itself. So, for example, if you entered C1 + 2 + x with C1 an arbitrary constant, it would reduce to just C1 + x. I worked with Ondrej, including visiting him in Los Alamos, and we built up a class that worked. The problem was that, in order to have auto-simplification, I had to write the simplification directly into the core. Neither of us liked this, so we worked a little bit on a basic core that would allow auto-simplification to be written directly in the classes instead of in the Mul.flatten() and Add.flatten() methods. It turns out that my constant class isn’t the only thing that would benefit from this. Things like the order class (O(x)) and the infinity class (oo) are auto-simplified in the core, and things could be much cleaner if they happened in the classes themselves. Unfortunately, modifying the core like this is not something that can happen overnight or even in a few weeks. For one thing, it needed to wait until we had the new assumptions system, which was another Google Summer of Code project running parallel to my own. So we decided to shelve the idea.

I still wanted constant simplification, so I settled with writing a function that could do it instead. There were some downsides to this. Making the function as general as the classes might have been would have been far too much work, so I settled on making it an internal-only function that only worked on symbols named C1, C2, etc. Also, unlike writing the simplification straight into Mul.flatten() which was as simple as removing any terms that were not dependent on x, writing a function that parsed an expression and simplified it was considerably harder to write. I managed to churn out something that worked, and so I was ready to finish up the solver I had started a few paragraphs ago.
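A toy version of the idea, for the additive case only, might look like the sketch below. The function absorb_into_constant is a hypothetical name invented here; it is not the internal function described above and handles far fewer cases, since it only looks at top-level terms of a sum.

```python
from sympy import symbols

x, C1 = symbols("x C1")


def absorb_into_constant(expr, var, const):
    """Fold every additive term free of `var` into `const`, if `const` is among those terms."""
    independent, dependent = expr.as_independent(var, as_Add=True)
    if independent.has(const):
        return const + dependent
    return expr


simplified = absorb_into_constant(C1 + 2 + x, x, C1)
untouched = absorb_into_constant(2 + x, x, C1)
print(simplified, untouched)
```

A real implementation also has to walk into products, powers, and function arguments, and to decide when a combination such as C1**2 can still be treated as arbitrary, which is where most of the difficulty mentioned above comes from.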

After I finished that, I still needed to maintain the ability to solve that special case ODE. Apparently, it is an ODE that you would get somewhere in deriving something about relativity, because it was in the relativity.py example file. I used Maple’s excellent odeadvisor() function (this is where I got the idea for my classify_ode()) to find a simple general case ODE that it fit (Liouville ODE). After I finished this, I was ready to start working on the hints engine.

It took me about a week to move all classification code into classify_ode(), move all solvers into individual functions, separate simplification code into yet other functions, and tie it all together in dsolve(). In the end, the model worked very well. The modularization allowed me to do some other things that I had not considered, such as creating a special “best” hint that used some heuristics that I originally developed for first order homogeneous which always has two possible solutions to try to give the best formatted solution for any ODE that has more than one possible solution method. It also made debugging individual methods much easier, because I could just use the built in hint calls in dsolve() instead of commenting out lines of code in the source.

This was good, because there was one more method that I wanted to implement. I wanted to be able to solve the inhomogeneous case of an nth order linear ODE with constant coefficients. This can be done in the general case using the method of variation of parameters. It was quite simple to set up variation of parameters in the code. You only have to set up a system of integrals using the Wronskian of the general solutions. It would usually be a very poor choice of a method if you were trying to solve an ODE by hand because taking the Wronskian and computing n integrals is a lot of work. But for a CAS, the work is already there. I just have to set up the integrals.
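For a second-order equation the setup really is short. The sketch below works through y'' + y = x by hand-building the Wronskian and the two integrals; the equation and the right-hand side are chosen here purely as an example, and the formula used is the textbook one for a leading coefficient of 1.

```python
from sympy import symbols, sin, cos, integrate, simplify

x = symbols("x")

# Homogeneous solutions of y'' + y = 0 and a sample right-hand side g(x) = x.
y1, y2 = cos(x), sin(x)
g = x

# Wronskian of the two homogeneous solutions: cos(x)**2 + sin(x)**2 collapses to 1.
W = simplify(y1 * y2.diff(x) - y2 * y1.diff(x))

# Variation of parameters: one integral per homogeneous solution.
yp = -y1 * integrate(y2 * g / W, x) + y2 * integrate(y1 * g / W, x)
yp = simplify(yp)
print(W, yp)
```

Note that everything here leans on simplify() recognising the trigonometric identity in the Wronskian; when that step fails, the integrands become much harder than they need to be.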

It soon became clear that even though, in theory, the method of variation of parameters can solve any ODE of this type, in practice, it does not always work so well in SymPy. This is because SymPy has very poor simplification, especially trigonometric simplification, so sometimes there would be a trigonometric Wronskian that would be identically equal to some constant, but it could only simplify it to some very large rational function of sines and cosines. When these were passed to the integral engine, it would cause it to fail, because it could not find the integral for such a seemingly complex expression.

In addition, taking Wronskians, simplifying them, and then taking n integrals is a lot of work as I said, and even when SymPy could do it, it took a long time. There is another method for solving these types of equations called undetermined coefficients that does not require integration. It only works on a class of ODEs where the right hand side of the ODE is a simple combination of sines, cosines, exponentials, and polynomials in x. It turns out that these kinds of functions are common anyway, so most ODEs of this type that you would encounter could be solved with this method. Unlike variation of parameters, undetermined coefficients requires considerable setup, including checking for different cases. This would be the method that you would want to use if you had to solve the ODE by hand because, even with all the setup, it only requires solving a system of linear equations vs. solving n integrals with variation of parameters, but for a CAS, it is the setup that matters, so this was a difficult prospect.
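To show why no integration is needed, here is a minimal sketch of undetermined coefficients for y'' + 4y = sin(x), an equation picked only for illustration. It covers the simplest case, where the trial function does not overlap with the homogeneous solutions; the overlapping cases need the trial function multiplied by a power of x and are not handled here.

```python
from sympy import symbols, sin, cos, solve

x, A, B = symbols("x A B")

# Trial solution for y'' + 4*y = sin(x): a combination of sin(x) and cos(x).
trial = A * sin(x) + B * cos(x)
residual = (trial.diff(x, 2) + 4 * trial - sin(x)).expand()

# The coefficient of each independent function must vanish separately,
# which gives a linear system in the unknowns A and B.
equations = [residual.coeff(sin(x)), residual.coeff(cos(x))]
coeffs = solve(equations, [A, B])
particular = trial.subs(coeffs)
print(coeffs, particular)
```

All of the work is in choosing the trial function and collecting coefficients; the final step is only a linear solve, which is why this route tends to be quicker than computing n integrals.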

I spent the last couple of weeks writing up the necessary algorithms to setup the required system of linear equations and handling the different cases. After I finally worked out all of the bugs, I ran some profiling against my variation of parameters solver. It turned out that for ODEs that had trigonometric solutions (which take longer to simplify), my undetermined coefficients solver was an order of magnitude faster than the variation of parameters solver (and that is just for the ODEs that the variation of parameters engine could even solve at all). For ODEs that only had exponentials, it was still 2-4 times faster.

I finished off the summer by writing extensive documentation for all of my solvers and functions. Hopefully someone who uses SymPy to solve ODEs can learn something about ODE solving methods as well as how to use the function I wrote when they read my documentation.

Post-GSoC
I plan on continuing development with SymPy now that the Google Summer of Code period is over. SymPy is an amazing project, mixing Python and Computer Algebra, and I want to help it grow. I may even apply again in a future year to implement some other thing in SymPy, or maybe apply as a mentor for SymPy to help someone else improve it.

Advice
What follows is some general advice for someone who wants to apply for Google Summer of Code. Some of the advice pertains specifically to SymPy, and some of it is general advice that I think would apply to any project.

- Get involved early. As soon as you decide that you want to participate in Google Summer of Code, start getting involved in the project. Get into contact with them and discuss possible projects. If you are looking before the participating organizations are announced, look at the organizations from previous years. For some organizations, it will vary; for others (like Python), it is almost given that they will be accepted every year.

- Some projects (including SymPy) require you to send in a patch that passes review to be accepted. This will give you a chance to start familiarizing yourself with the code base. If you are applying to SymPy, the Wester example I mentioned above is a really good way to learn what SymPy can do and how it works.

- Subscribe to the mailing list, and once you are comfortable with it, participate. Also, it is a good idea to idle in IRC (SymPy is on freenode at #sympy). This will help you get to know the main contributors for the project.

- For your application, see if the people in the project you are applying for will review it. If they like your project idea, they will try to help you write a good application so you can be accepted and you can implement it. If they don’t like your idea, then they will tell you and you should change it, otherwise you will not be accepted, no matter how well written your proposal is. I have my proposal on the wiki (see link above). I am not saying that it is necessarily a very good proposal, but it did get accepted. If you are applying to SymPy, Ondrej will proofread your applications for you.

- If you are an IRC fan, there is also #gsoc on freenode, where you can ask all your GSoC related questions. Be warned that it does get pretty noisy in the application period, especially right before the applications are due and right before proposals are accepted.

- I cannot stress this one enough. If you have never worked with a version control system before, it is perhaps more important to spend your time learning it than it is to learn the code base for your project. These things have a steep learning curve if you have never used them before. Once you master them though, they can make your life much easier. Also, the sooner you learn to use them well, the easier your life will be later on down the road. I spent a good part of the last week of GSoC cleaning up my commit history from the first half of the summer when I had very poor committing/log habits. If your project uses git, such as SymPy does, you might look at this tutorial. If it uses something else, good luck. Seriously, git is the only good version control system. See this video.

- Expect to spend only about half of the summer actually implementing stuff. You may think that you are a good programmer and that your code will not be so buggy that you will need to spend that much time fixing bugs, and you may be right. But the fact is, you will be working on code bases written by many programmers that are not so good. You will need to fix several already existing bugs to make your code work, which means that you will need to learn the code base well, learn how to read other people’s code, and how to fix bugs that you had no part in creating. You will be glad if a bug is in your code because you will usually know immediately what causes it and how to fix it. But if a bug is somewhere else, you will need to find it, figure out why it happens, what is supposed to happen, and how to fix it without breaking anything else. This is also why it is important to be active in the developer community.

- Good luck.

September 07, 2009 06:49 AM

September 05, 2009

Fabian Pedregosa

Summer of Code is over

Google Summer of Code program is officially over. It has been four months of intense work, exciting benchmarks and patch reviewing. It was a huge pleasure working with you guys!

As for the project, I implemented a complete logic module and then an assumption system for sympy (sympy.logic, sympy.assumptions, sympy.queries). I even had time to make the logic module fast. On top of this, there’s the refine module. It is there where you can see some nice examples and where all the power of sympy.queries and sympy.logic is exposed.

Although this sounds good, there are some things that I did not complete on time. I could not remove the old assumption system. There are simply too many things that depend on it to remove it in one move. However, I agreed with Ondrej that we would both be working on this during 15-30 September. This has to be done because we definitely do not want to make a sympy release with two different assumption systems!

PS: a more detailed report lives here

September 05, 2009 10:21 AM

August 31, 2009

Skipper Seabold

scikits.statsmodels Release Announcement

We have been working hard to get a release ready for general consumption for the statsmodels code. Well, we're happy to announce that a (very) beta release is ready.

Background
==========

The statsmodels code was started by Jonathan Taylor and was formerly included as part of scipy. It was taken up to be tested, corrected, and extended as part of the Google Summer of Code 2009.

What it is
==========

We are now releasing the efforts of the last few months under the scikits namespace as scikits.statsmodels. Statsmodels is a pure python package that requires numpy and scipy. It offers a convenient interface for fitting parameterized statistical models with growing support for displaying univariate and multivariate summary statistics, regression summaries, and (postestimation) statistical tests.

Main Features
=============

* regression: Generalized least squares (including weighted least squares and least squares with autoregressive errors), ordinary least squares.
* glm: Generalized linear models with support for all of the one-parameter exponential family distributions.
* rlm: Robust linear models with support for several M-estimators.
* datasets: Datasets to be distributed and used for examples and in testing.

There is also a sandbox which contains code for generalized additive models (untested), mixed effects models, cox proportional hazards model (both are untested and still dependent on the nipy formula framework), generating descriptive statistics, and printing table output to ascii, latex, and html. None of this code is considered "production ready".

Where to get it
===============

Development branches will be on LaunchPad. This is where to go to get the most up to date code in the trunk branch. Experimental code will also be hosted here in different branches.

https://code.launchpad.net/statsmodels

Source download of stable tags will be on SourceForge.

https://sourceforge.net/projects/statsmodels/

or

PyPi: http://pypi.python.org/pypi/scikits.statsmodels/

License
=======

Simplified BSD

Documentation
=============

The official documentation is hosted on SourceForge.

http://statsmodels.sourceforge.net/

The sphinx docs are currently undergoing a lot of work. They are not yet comprehensive, but should get you started.

This blog will continue to be updated as we make progress on the code.

Discussion and Development
==========================

All chatter will take place on the or scipy-user mailing list. We are very interested in receiving feedback about usability, suggestions for improvements, and bug reports via the mailing list or the bug tracker at https://bugs.launchpad.net/statsmodels.

August 31, 2009 11:01 PM

Andrew Friedley

This will probably be my last post for the project. I'd like to thank again my mentor Chris Mueller and backup mentor Stéfan van der Walt; I think they did a lot in helping to make this project happen in the first place.

I've written a bunch of documentation on the CorePy wiki (a lot of it came from the code/README):

http://corepy.org/wiki/index.php?title=CoreFunc

When any updates to this work occur, the wiki page will be updated.

The SciPy talk went pretty well I think, though I wish I had more than 10 minutes to talk. A lot of important things got left out, and I felt there were places I didn't explain things very well. Even so, it was great to be able to go to SciPy and give a talk -- thanks to the people who make SciPy happen! There seemed to be quite a few people interested in CorePy; I had quite a few questions and had some nice discussions with various people.
Slides:
http://www.osl.iu.edu/~afriedle/presentations/scipy-corepy-afriedle.pdf

Video recording of my actual talk:
http://www.archive.org/details/scipy09_day1_08-Andrew_Friedley

As promised I used the CoreFunc framework to update Chris' particle simulation demo. Pretty cool stuff -- I was able to move the entire particle position/velocity update calculation into a ufunc. At 200,000 particles I was seeing ~100x speedup over the pure NumPy version, though my measurement wasn't particularly scientific. Check out the glparticle-cf.py file in the project repository/branch (listed in an earlier post).

August 31, 2009 11:10 AM

August 29, 2009

Jeff Ling

New Computer

Well GSoC is officially finished but there is still work to do. The connection for the whiteboard still needs to be closed, and there are many GUI-related problems. Unfortunately I don’t think I will be able to commit for a while because I just bought a new computer (asus G51VX-A1) and I need to install the dev environment on it. Also, I am going back to school in two days and it will be hectic until classes start.

I’m currently trying to get an internship for third year, or third year summer. This seems interesting :)

August 29, 2009 09:39 PM

August 27, 2009

Skipper Seabold

GSoC Is Over

Whoa, where did the last month go? The Google Summer of Code 2009 officially ended this Monday. Though I haven't taken a breath to update the blog, we (Josef and I) have been hard at work on the models code.

We have working and tested versions of Generalized Least Squares, Weighted Least Squares, Ordinary Least Squares, Robust Linear Models with several M-estimators, and Generalized Linear Models with support for all (almost all?) one parameter exponential family distributions. We have also provided some more convenience functions, created a standalone python package for the models code, and obtained permissions to distribute a few more datasets. Due to a lack of time, there is only experimental (read untested) support for autoregressive models, mixed effects models, generalized additive models, and convenience functions for returning strings (possibly html and latex output as well) with regression results and descriptive statistics. I will continue to work on these as I find time.

I will soon post a note on the progress that was made in the robust linear models code. Also, look out for a (semi-) official release of the code in the next few days. We have decided to name the project statsmodels and distribute it as a scikit. We need to finalize the documentation (should be ready to go in the next day or so...I am back taking courses) and clean up some of the usage examples, so people can jump right in and use the code, give feedback, and hopefully contribute extensions and new models.

As for the future of statsmodels, over the next few weeks we will be discussing the immediate extensions that we know we would like to make. It's looking like I will be wearing my microeconometrician hat this semester in my own coursework. More specifically, I will probably be working with cross-sectional and panel data models for household survey data in my own research and finding some time for time series models as part of my teaching assistantship. Josef has also mentioned wanting to work more with time series models.

If anyone (especially those from other disciplines) would like to contribute or see some extensions (my apologies to those who have made requests that I haven't yet been able to accommodate) feel free to post to the scipy-dev mailing list. I'm more than happy to discuss/debate with users and potential developers the design decisions that have been made, as I think the code is still in an unsettled enough state to merit some discussion.

August 27, 2009 08:59 PM

August 26, 2009

Kang Zhang

Python keyring lib, new blog address

After three days full-time working on Java and ActionScript 3.0, I finally get time to write a post about recent news.

#1 Python keyring lib released
We released the Python keyring lib four days ago. It's exciting to see my work get released, and I've created a site for it.


Address of the site: http://home.python-keyring.org

You can find our mailing list, documentation page, repository and issue list on the site. If you have any advice or questions, don't hesitate to tell us.
Someone reported installation problems on our tracker. I'll try to fix these bugs and release v0.2 this weekend.

I'd like to thank Tarek again for his great help on this project. I've learned a lot from him, and I'll continue with the project to make it better. A new post about the tips and experiences I've gathered this summer will be published later, when I get another break. :-)

#2 New domain name for this blog and new homepage for myself
I'm going to move this blog to http://blog-en.kangzhang.org. To give subscribers of this blog time to see this post and re-subscribe at the new address, it will be ready after the next 12 hours. (I really miss Tarek's Atomisator now.)
http://home.kangzhang.org is also available now.

August 26, 2009 09:38 AM

August 24, 2009

Dale Peterson

New examples of use with PyDy

In the last several days I’ve made a lot of updates to tools for using PyDy and have come very close to settling on the way that PyDy is used to derive the equations of motion.  I have added and updated several examples, including the double pendulum and a rigid body with two reaction wheels used for attitude control.  Animations for each example have been added using visual python, so visualization of the dynamics is very tangible.

There are two main things that I am still trying to iron out with PyDy.  The first is how to handle ignorable coordinates and how to allow for the user to control whether or not they are included in the output equations of motion.  For example, for the symmetric rolling disc, there are 4 ignorable coordinates:  two for the location in the ground plane, one for the heading and one for the spin.  For purposes of stability analysis, the kinematic differential equations associated with these coordinates are not needed.  However, for purposes of animation, these equations are needed.  My goal is to make it easy to clearly specify whether or not these equations are desired in the output equations.

The second major thing that still needs work is handling dependent generalized speeds in nonholonomic systems.  When the motion equations are generated, they will involve these dependent generalized speeds, and their time derivatives, but these quantities can be computed from the constraint equations and therefore can be left implicit in the final equations of motion.  The derivatives of the dependent generalized speeds can be intelligently computed by a careful formulation of the gradients of the components of the matrix that relates the dependent speeds to the independent ones.
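One way to write out the relationship described above is sketched below. The notation is assumed here for illustration and is not taken from PyDy: u_d are the dependent speeds, u_i the independent speeds, q the generalized coordinates, and A(q) the matrix obtained from the (assumed linear and homogeneous) constraint equations, with no explicit time dependence.

```latex
u_d = A(q)\,u_i
\quad\Longrightarrow\quad
\dot{u}_d = A(q)\,\dot{u}_i + \dot{A}(q)\,u_i,
\qquad
\dot{A}_{jk} = \sum_{l} \frac{\partial A_{jk}}{\partial q_l}\,\dot{q}_l
```

The gradients of the components of A with respect to the coordinates are the only new quantities needed, which is why forming them carefully matters.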

I will be working on both of these tasks this week and will be writing examples for the Whipple bicycle model in addition to a spinning top and the rattleback.

August 24, 2009 05:11 AM

Kurt Smith

freemalloc


I’m glad to say that my Google Summer of code project was successful, despite the features that didn’t make it in fwrap by August 17th.

Oh, yeah — if you’re wondering what ‘fwrap’ is, that’s the new name of ‘f2cy’.  The rebranding was appropriate since ‘f2cy’ didn’t quite capture all that the utility does.  fwrap wraps fortran for a number of languages (C, Cython & Python), so ‘f2cy’ was deemed a misnomer since it seems to wrap fortran just for cython, besides the confusion with ‘f2py’.

fwrap was presented at the SciPy 2009 conference, and I was glad to find a good amount of interest; two (Kyle & Chris) stepped up to work on fwrap once I can get it to a state that’s comprehensible :-)   They work on a code called ‘clawpack’ at the U-Washington, and want to use fwrap for their ‘pyclaw’ version of clawpack.  But I’m  getting ahead of myself.

Here’s where you can find the presentation abstract (note that it was written in early June, so it’s heavy on promised features that didn’t all make it in fwrap by August 17th), and here are the talk slides.  The presentation itself is here.

What’s the state of fwrap?  It can handle scalar arguments with aplomb.  Assumed-shape arrays are working.  Explicit-shape & assumed-size arrays are soon to come.  I’m sketching out callbacks as we speak (a feature that Kyle & Chris are particularly interested in).

The parser used by fwrap needs stabilization — some obvious things need work (public/private module attributes), and I’m finding out just how unruly the full Fortran language specification is.  The language allows you to be expulsively verbose & clunky (‘integer(kind=FOOBAR), intent(in), dimension(2) :: an_int_array’) — yet difficult to fully parse, since each attribute (‘intent,’ ‘dimension,’ ’save,’ etc.) can instead be a statement on a line a ways away from the actual ‘integer an_int_array’ line; or you can be cryptic & bug-prone by using the implicit declaration anti-feature well-hated by all those who have ever had a misspelled variable in their code.

The above rant is a long way of saying that there are many little things that fparser chokes on.  This is not to be interpreted as a complaint with fparser — I’m grateful for all of Pearu’s work thus far on it, and for his ambition to tackle parsing such a barnacled language.  He has graciously opened up fparser to me to work on it, and has allowed us to package fparser with fwrap.  Thanks, Pearu!

Anyway, work on fwrap will be a bit slow for the next week or two, as I turn my full attention to my research.  But I’m energized and motivated to get the first release out before 2010.

August 24, 2009 03:47 AM

August 19, 2009

Fabian Pedregosa

Speed improvements for ask() (sympy.queries.ask)

I managed to overcome the overhead in ask() that arises when converting between the symbol and integer representations of sentences in conjunctive normal form.

The result went beyond what I expected. The test suite for the query module got more than 10 times faster on my laptop: from 26 seconds, it dropped to an impressive 2.03 secs. There is still room for improvement, but it is no longer “so desperately slow”.

I’ll submit those patches soon to sympy’s trunk, but in the meantime they are in my logic branch:

git pull http://fseoane.net/git/sympy.git logic

August 19, 2009 10:36 PM

Wojciech Walczak

gminick


(Most up-to-date version is located in the repository on bitbucket)

Sphinx comments/fixes web application

Sphinx provides you with a feature of building your documentation as a web application, which gives you a way to interact with your users.

Users are permitted to submit comments as well as their fixes for the documentation. Developers have additional rights of deleting comments/fixes and committing fixes to the documentation repository.

Building a webapp along with documentation

Building your documentation along with a web application, which will serve it, is almost as simple as building the documentation itself.

1. Start a new Sphinx project:

$ mkdir sphinx-project

$ cd sphinx-project

$ sphinx-quickstart

[ ...usual sphinx-quickstart questions... ]

2. The documentation is empty as for now. Create at least one file with some content. Now, you can build a webapp which will serve your documentation:

$ make webapp

[...]

Build finished. The webapp HTML pages are in _build/webapp.

3. Run the server:

$ cd _build/webapp/

$ python server.py

Running at 127.0.0.1:8000…

OK, there it goes. Now you can check it with web browser!

The webapp configuration file

The variables and values of webapp.conf file are based on conf.py file.

NOTE: Always change the values of variables in conf.py file.

File structure of the build

After building your documentation, the _build/webapp/ directory gets populated with a number of directories and files.

Directories:

Files:

All for now, folks!

August 19, 2009 04:59 PM

August 18, 2009

Fabian Pedregosa

Logic module (sympy.logic): improving speed

Today I’ve been doing some speed improvements for the logic module. More precisely, I implemented an efficient internal representation for clauses in conjunctive normal form.

In practice this means a huge performance boost for all problems that make use of the function satisfiable() or dpll_satisfiable(). For example, test_dimacs.py has moved from 2.7 seconds to an impressive 0.3 sec, and ask() runs on average 3 times faster, although both problems still have an overhead because of the conversion to this new representation, which can be avoided in most cases.

Now, the details. Traditionally, dpll (the algorithm that we use for deciding satisfiability) used to store clauses as arrays of symbols, and this worked fine, but sadly comparing symbols in sympy is slow, and this algorithm does a lot of comparisons … but we can map each sympy symbol to a unique integer, and with minor modifications to the algorithm we get these performance gains.
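The idea of replacing symbols with integers can be sketched in plain Python. This is a minimal illustration, not the code in the logic branch: the function names encode, assign and dpll are hypothetical, and the string format with a leading '~' for negation is an assumption made for the example.

```python
def encode(clauses):
    """Map each symbol name to a positive integer; a negated literal
    ('~a') becomes the negative of that integer."""
    table = {}
    encoded = []
    for clause in clauses:
        ints = set()
        for literal in clause:
            name = literal.lstrip('~')
            index = table.setdefault(name, len(table) + 1)
            ints.add(-index if literal.startswith('~') else index)
        encoded.append(frozenset(ints))
    return encoded, table


def assign(clauses, literal):
    """Simplify the clause list under the assumption that `literal` is true."""
    result = []
    for clause in clauses:
        if literal in clause:
            continue                        # clause satisfied, drop it
        result.append(clause - {-literal})  # remove the falsified literal
    return result


def dpll(clauses):
    """Return True if the integer-encoded CNF is satisfiable."""
    if not clauses:
        return True
    if any(len(clause) == 0 for clause in clauses):
        return False
    # Unit propagation: a one-literal clause forces that literal.
    for clause in clauses:
        if len(clause) == 1:
            (literal,) = clause
            return dpll(assign(clauses, literal))
    # Otherwise branch on some literal from the first clause.
    literal = next(iter(clauses[0]))
    return dpll(assign(clauses, literal)) or dpll(assign(clauses, -literal))


cnf, table = encode([['a', 'b'], ['~a', 'b'], ['~b', 'c']])
print(dpll(cnf))
unsat, _ = encode([['a'], ['~a']])
print(dpll(unsat))
```

Every comparison inside assign() and dpll() is now between small integers, which is where the speedup over comparing symbolic objects would come from.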

Now, the code. You can pull from my branch logic:

git pull http://fseoane.net/git/sympy.git logic

There are now some obvious performance tweaks we can do:

- in ask(), we can skip the conversion to integer representation by ‘precompiling’ known_facts_dict into this representation. This should be easy and will probably give performance boosts of several orders of magnitude.

- this integer representation is very similar to the one used in dimacs CNF files, so a parser that directly converts CNF files to this integer representation should make solving CNF files much faster.
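The second point can be sketched as follows: a DIMACS CNF file already stores each literal as a signed integer, so a parser can produce integer clauses directly. This is a hypothetical helper (parse_dimacs is not a function from the branch) that handles only the common subset of the format: comment lines, the problem line, and clauses terminated by 0.

```python
def parse_dimacs(text):
    """Parse DIMACS CNF text into a list of clauses, each a list of
    nonzero integers (negative = negated variable)."""
    clauses = []
    current = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments ('c ...') and the problem line ('p cnf ...').
        if not line or line[0] in 'cp':
            continue
        for token in line.split():
            value = int(token)
            if value == 0:          # 0 terminates the current clause
                clauses.append(current)
                current = []
            else:
                current.append(value)
    if current:                     # tolerate a missing final 0
        clauses.append(current)
    return clauses


sample = """c a tiny example
p cnf 3 2
1 -3 0
2 3 -1 0
"""
print(parse_dimacs(sample))  # [[1, -3], [2, 3, -1]]
```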

I would like to give some credit to Ronan Lamy, who sent a patch some time ago, and although I did not include it (yet) in the main sympy branch, it inspired these modifications.

August 18, 2009 09:35 PM

August 17, 2009

Aaron Meurer

asmeurer


[Sorry for the delay in this post. I was having some difficulties coming up with some of the rationales below. Also, classes have started, which has made me very busy.]

If there was one ODE solving method that I did not want to implement this summer, it was undetermined coefficients. I didn’t really like the method too much when we did it in my ODE class (though it was not as unenjoyable as series methods). The thing that I never really understood very well is to what extent you have to multiply terms in the trial function by powers of x to make them linearly independent of the solution to the homogeneous equation. We did our ODEs homework in Maple, so I would usually just keep trying higher powers of x until I got a solution. But to implement it in SymPy, I had to have a much better understanding of the exact rules for it.

From a user’s point of view, the method of undetermined coefficients is much better than the method of variation of parameters. While it is true that variation of parameters is a general method and undetermined coefficients only works on a special class of functions, undetermined coefficients requires no integration or advanced simplification, so it is fast (very fast, as we shall see below). All that the CAS has to do is figure out what a trial function looks like, plug it into the ODE, and solve for the coefficients, which is a system of linear equations.

On the other hand, from the programmer’s point of view, variation of parameters is much better. All you have to do is take the Wronskian of the general solution set and use it to set up some integrals. But the Wronskian has to be simplified, and if the general solution contains sin’s and cos’s, this requires trigonometric simplification not currently available in SymPy (although it looks like the new Polys module will be making a big leap forward in this area). Also, integration is slow, and in SymPy, it often fails (hangs forever).

Figuring out what the trial function should be for undetermined coefficients is way more difficult to program, but having finally finished it, I can say that it is definitely worth having in the module. Problems that it can solve can run orders of magnitude faster than with variation of parameters, and often variation of parameters can’t do the integral or returns a less simplified result.

So what is this undetermined coefficients? Well, the idea is this: if you knew what each linearly independent term of the particular solution was, minus the coefficients, then you could just set each coefficient as an unknown, plug it into the ODE, and solve for them. It turns out that the resulting system of equations is linear, so if you do the first part right, you can always get a solution.

The key thing here is that you know what form the particular solution will take. However, you don’t really know this ahead of time. All you have is the linear ODE a_ny^{(n)}(x) + \dots + a_1y'(x) + a_0y(x) = F(x). (As far as I can tell, this only works in the case where the coefficients a_i are constant with respect to x. I’d be interested to learn that it works for other linear ODEs. At any rate, that is the only case that works in my branch right now.) The solution to the ODE is y(x) = y_g(x) + y_p(x), where y_g(x) is the solution to the homogeneous equation F(x) \equiv 0, and y_p(x) is the particular solution that produces the F(x) term on the right hand side. The key here is just that: if you plug y_p(x) into the left hand side of the ODE, you get F(x).

It turns out that this method only works if the function F(x) only has a finite number of linearly independent derivatives (I am unsure, but this might be able to work in other cases, but it would involve much more advanced mathematics). So what kind of functions have a finite number of linearly independent derivatives? Obviously, polynomials do. So do e^x, \cos{x}, and \sin{x}. Also, if we multiply two or more of these types together, then we will get a finite number of linearly independent derivatives after applying the product rule. But is that all? Well, if we take the definition of linear independence from linear algebra, we know that a set of n vectors \{\boldsymbol{v_1}, \boldsymbol{v_2}, \boldsymbol{v_3}, \dots, \boldsymbol{v_n}\}, not all zero, are linearly independent only if a_1\boldsymbol{v_1} + a_2\boldsymbol{v_2} + a_3\boldsymbol{v_3} + \dots + a_n\boldsymbol{v_n}=0 holds only when a_1 \equiv 0, a_2 \equiv 0, a_3 \equiv 0, \dots, a_n \equiv 0, that is, the only solution is the trivial one (remember, this is the definition of linear independence). They are linearly dependent if there exist weights a_1, a_2, a_3, \dots, a_n, not all 0, such that the equation a_1\boldsymbol{v_1} + a_2\boldsymbol{v_2} + a_3\boldsymbol{v_3} + \dots + a_n\boldsymbol{v_n}=0 is satisfied. Using this definition, we can see that a function f(x) will have a finite number of linearly independent derivatives if it satisfies a_nf^{(n)}(x) + a_{n - 1}f^{(n - 1)}(x) + \dots + a_1f'(x) + a_0f(x) = 0 for some n and with a_i\neq 0 for some i. But this is just a homogeneous linear ODE with constant coefficients, which we know how to solve. The solutions are all linear combinations of terms of the form ax^ne^{b x}\cos{cx} or ax^ne^{b x}\sin{cx}, where a, b, and c are real numbers and n is a non-negative integer. We can set the various constants to 0 to get the type we want. For example, for a polynomial term, b will be 0 and c will be 0 (use the cos term).
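A small worked check of this criterion, added here as an illustration with f(x) = x e^{x} chosen arbitrarily:

```latex
f(x) = x e^{x}, \qquad
f'(x) = e^{x} + x e^{x}, \qquad
f''(x) = 2 e^{x} + x e^{x}
\quad\Longrightarrow\quad
f''(x) - 2 f'(x) + f(x) = 0
```

Every derivative of x e^{x} is therefore a combination of e^{x} and x e^{x}, so there are only two linearly independent ones. By contrast, the derivatives of \frac{1}{x} are -\frac{1}{x^2}, \frac{2}{x^3}, \dots, and no finite set of them spans the rest.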

So this gives us the exact form of functions that we need to look for to apply undetermined coefficients, based on the assumption that it only works on functions with a finite number of linearly independent derivatives.

Well, implementing it was quite difficult. For every ODE, the first step in implementation is matching the ODE, so the solver can know what methods it can apply to a given ODE. To match in this case, I had to write a function that determined if the right hand side matched the form given above, which was not too difficult, though not as trivial as just grabbing the right hand side in variation of parameters. The next step is to use the matching to format the ODE for the solver. In this case, it means finding all of the finitely many linearly independent derivatives of the right hand side of the ODE, so that the solver can just create a linear combination of them and solve for the coefficients. This was a little more difficult, and it took some lateral thinking.

At this point, there is one more thing that needs to be noted. Since the trial functions, that is, the linearly independent derivative terms of the right hand side of the ODE, are of the same form as the solutions to the homogeneous equation, it is possible that one of the trial function terms will be a solution to the homogeneous equation. If this happens, plugging it into the ODE will cause it to go to zero, which means that we will not be able to solve for a coefficient for that term. Indeed, that term will be of the form C1*\textrm{term} in the final solution, so even if we had a coefficient for it, it would be absorbed into this term from the solution to the homogeneous equation. For example, variation of parameters will give a coefficient for such terms, even though it is unnecessary. This is a clue that Maple uses variation of parameters for all linear constant coefficient ODE solving, because it gives the unnecessary terms with the coefficients that would be given by variation of parameters, instead of absorbing them into the arbitrary constants.

We can safely ignore these terms for undetermined coefficients, because their coefficients will not even appear in the system of linear equations of the coefficients anyway. But, without these coefficients, we will run into trouble. It turns out that if a term x^ne^{ax}\sin{bx} or x^ne^{ax}\cos{bx} is a repeated solution to the homogeneous equation, and x^{n + 1}e^{ax}\sin{bx} or x^{n + 1}e^{ax}\cos{bx} is not, so that n is the highest x power that makes it a solution to the homogeneous equation, and if the trial solution has x^me^{ax}\sin{bx} or x^me^{ax}\cos{bx} terms, but not x^{m + 1}e^{ax}\sin{bx} or x^{m + 1}e^{ax}\cos{bx} terms, so that m is the highest power of x in the trial function terms, then we need to multiply these trial function terms by x^{n + 1}, which raises their highest power to x^{n + m + 1}, to make them linearly independent with the solutions of the homogeneous equation.
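The bookkeeping can be sketched in plain Python using the standard textbook form of the rule: multiply the trial terms by x**s, where s is the multiplicity of the corresponding root of the characteristic polynomial. The helpers root_multiplicity and shifted_powers are hypothetical names, and the sketch assumes a real rational root (the pure exponential or polynomial case), so the sin and cos terms that come from complex roots are not handled.

```python
from fractions import Fraction


def root_multiplicity(coeffs, root):
    """Count how many times `root` is a root of the polynomial whose
    coefficients are listed from the highest power down."""
    coeffs = [Fraction(c) for c in coeffs]
    root = Fraction(root)
    count = 0
    while len(coeffs) > 1:
        # Synthetic division by (r - root); the last value is the remainder.
        quotient = []
        carry = Fraction(0)
        for c in coeffs:
            carry = carry * root + c
            quotient.append(carry)
        if quotient[-1] != 0:
            break
        coeffs = quotient[:-1]
        count += 1
    return count


def shifted_powers(char_coeffs, root, max_power):
    """Powers of x for the trial terms x**k * exp(root*x), k = 0..max_power,
    after multiplying by x**s, where s is the multiplicity of `root`."""
    s = root_multiplicity(char_coeffs, root)
    return [k + s for k in range(max_power + 1)]


# f''' + 3 f'' + 3 f' + f = 2 exp(-x) - x**2 exp(-x)
# Characteristic polynomial: r**3 + 3 r**2 + 3 r + 1 = (r + 1)**3
print(root_multiplicity([1, 3, 3, 1], -1))   # 3
print(shifted_powers([1, 3, 3, 1], -1, 2))   # [3, 4, 5]
```

The powers 3 to 5 agree with the x**3 and x**5 terms that appear in the dsolve() output for this ODE in the benchmarks.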

Most references simply say that you need to multiply the trial function terms by “sufficient powers of x” to make them linearly independent with the homogeneous solution. Well, this is just fine if you are doing it by hand or you are creating the trial function manually in Maple and plugging it in and solving for the coefficients. You can just keep upping the powers of x until you get a solution for the coefficients. Creating those trial functions in Maple, plugging them into the ODE, and solving for the coefficients is exactly what I had to do for my homework when I took ODEs last spring, and this “upping powers” trial and error method is exactly the method I used. But when you are doing it in SymPy, you need to know exactly what power to multiply it by. If it is too low, you will not get a solution for the coefficients. If it is too high, you can actually end up with too many terms in the final solution, giving a wrong answer.

Fortunately, my excellent ODEs textbook gives the exact cases to follow, and so I was able to implement it correctly. The textbook also gives a whole slew of exercises, all of which have solutions given. As usual, this helped me to find the bugs in my very complex and difficult to write routine. It also helped me to find a match bug that would have prevented dsolve() from being able to match certain types of ODEs. The bug turned out to be fundamental to the way match() is written, so I had to write my own custom matching function for linear ODEs.

The final step in solving the undetermined coefficients is of course just creating a linear combination of the trial function terms, plugging it into the original ODE, and setting the coefficients of each term on each side equal to each other, which gives a linear system. SymPy can solve these easily, and once you have the values of the coefficients, you can use them to build your particular solution, at which point, you are done.
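This final step can be sketched for the purely polynomial case in plain Python. It is an illustration only, using the standard library's fractions module rather than SymPy; the function names and the dictionary representation {power: coefficient} are assumptions made for the example, and the linear solver assumes the system is consistent with a unique solution.

```python
from fractions import Fraction


def derivative(poly):
    """Differentiate a polynomial given as {power: coefficient}."""
    return {p - 1: c * p for p, c in poly.items() if p > 0}


def apply_operator(ode_coeffs, poly):
    """Apply a_0*f + a_1*f' + ... + a_n*f^(n) to a polynomial,
    where ode_coeffs = [a_0, a_1, ..., a_n]."""
    result = {}
    current = dict(poly)
    for a in ode_coeffs:
        for p, c in current.items():
            result[p] = result.get(p, 0) + a * c
        current = derivative(current)
    return result


def solve_linear(rows):
    """Gauss-Jordan elimination on an augmented matrix of Fractions.
    Assumes the system is consistent and has a unique solution."""
    rows = [list(r) for r in rows]
    n_unknowns = len(rows[0]) - 1
    for col in range(n_unknowns):
        pivot = next(r for r in range(col, len(rows)) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [v / pivot_value for v in rows[col]]
        for r in range(len(rows)):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [v - factor * w for v, w in zip(rows[r], rows[col])]
    return [rows[i][-1] for i in range(n_unknowns)]


def undetermined_coefficients(ode_coeffs, rhs, trial_powers):
    """Find c_k such that sum(c_k * x**k for k in trial_powers) solves the ODE."""
    # Image of each trial monomial under the differential operator.
    images = [apply_operator(ode_coeffs, {p: Fraction(1)}) for p in trial_powers]
    # One linear equation per power of x that appears on either side.
    powers = sorted(set(rhs).union(*images))
    rows = [[img.get(p, Fraction(0)) for img in images] + [Fraction(rhs.get(p, 0))]
            for p in powers]
    return dict(zip(trial_powers, solve_linear(rows)))


# f'' + f' = x**2 + 2*x, with trial terms x, x**2, x**3
# (the polynomial terms 1, x, x**2 shifted by one power of x).
solution = undetermined_coefficients([0, 1, 1], {2: 1, 1: 2}, [1, 2, 3])
print(solution)
```

For this ODE the only nonzero coefficient is 1/3 on x**3, which agrees with the particular solution x**3/3 in the dsolve() output shown in the benchmarks.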

The results were astounding. Variation of parameters would hang on many simple inhomogeneous ODEs because of poor trig simplification of the Wronskian, but my undetermined coefficients method handles them perfectly. Also, there is no need to worry about absorbing superfluous terms into the arbitrary constants as with variation of parameters, because they are removed from within the undetermined coefficients algorithm.

But the biggest thing was speed. Here are some benchmarks on some random ODEs from the test suite. WordPress code blocks are impervious to whitespace, as I have mentioned before, so no pretty printing here. Also, it truncates the hints. The hints used are 'nth_linear_constant_coeff_undetermined_coefficients' and 'nth_linear_constant_coeff_variation_of_parameters':

In [1]: time dsolve(f(x).diff(x, 2) - 3*f(x).diff(x) - 2*exp(2*x)*sin(x), f(x), hint='nth_linear_constant_coeff_undetermined_coefficients')
CPU times: user 0.07 s, sys: 0.00 s, total: 0.08 s
Wall time: 0.08 s
Out[2]:
f(x) == C1 + (-3*sin(x)/5 - cos(x)/5)*exp(2*x) + C2*exp(3*x)

In [3]: time dsolve(f(x).diff(x, 2) - 3*f(x).diff(x) - 2*exp(2*x)*sin(x), f(x), hint='nth_linear_constant_coeff_variation_of_parameters')
CPU times: user 0.92 s, sys: 0.01 s, total: 0.93 s
Wall time: 0.94 s
Out[4]:
f(x) == C1 + (-3*sin(x)/5 - cos(x)/5)*exp(2*x) + C2*exp(3*x)

In [5]: time dsolve(f(x).diff(x, 4) - 2*f(x).diff(x, 2) + f(x) - x + sin(x), f(x), hint='nth_linear_constant_coeff_undetermined_coefficients')
CPU times: user 0.06 s, sys: 0.00 s, total: 0.06 s
Wall time: 0.06 s
Out[6]:
f(x) == x - sin(x)/4 + (C1 + C2*x)*exp(x) + (C3 + C4*x)*exp(-x)

In [7]: time dsolve(f(x).diff(x, 4) - 2*f(x).diff(x, 2) + f(x) - x + sin(x), f(x), hint='nth_linear_constant_coeff_variation_of_parameters')
CPU times: user 5.43 s, sys: 0.03 s, total: 5.46 s
Wall time: 5.52 s
Out[8]:
f(x) == x - sin(x)/4 + (C1 + C2*x)*exp(x) + (C3 + C4*x)*exp(-x)

In [9]: time dsolve(f(x).diff(x, 5) + 2*f(x).diff(x, 3) + f(x).diff(x) - 2*x - sin(x) - cos(x), f(x), 'nth_linear_constant_coeff_undetermined_coefficients')
CPU times: user 0.10 s, sys: 0.00 s, total: 0.10 s
Wall time: 0.11 s
Out[10]:
f(x) == C1 + (C2 + C3*x - x**2/8)*sin(x) + (C4 + C5*x + x**2/8)*cos(x) + x**2

In [11]: time dsolve(f(x).diff(x, 5) + 2*f(x).diff(x, 3) + f(x).diff(x) - 2*x - sin(x) - cos(x), f(x), 'nth_linear_constant_coeff_variation_of_parameters')


The last one hangs: it involves a particularly difficult Wronskian for SymPy (run it with hint='nth_linear_constant_coeff_variation_of_parameters_Integral', simplify=False).

Wall time comparisons reveal amazing speed differences. We’re talking orders of magnitude.

In [13]: 0.94/0.08
Out[13]: 11.75

In [14]: 5.52/0.06
Out[14]: 92.0

In [15]: oo/0.11
Out[15]: +inf


Of course, variation of parameters has the most difficult time when there are sin and cos terms involved, because of the poor trig simplification in SymPy. So let’s see what happens with an ODE that just has exponentials and polynomial terms involved.

In [16]: time dsolve(f(x).diff(x, 2) + f(x).diff(x) - x**2 - 2*x, f(x), hint='nth_linear_constant_coeff_undetermined_coefficients')
CPU times: user 0.10 s, sys: 0.00 s, total: 0.10 s
Wall time: 0.10 s
Out[17]:
f(x) == C1 + x**3/3 + C2*exp(-x)

In [18]: time dsolve(f(x).diff(x, 2) + f(x).diff(x) - x**2 - 2*x, f(x), hint='nth_linear_constant_coeff_variation_of_parameters')
CPU times: user 0.19 s, sys: 0.00 s, total: 0.19 s
Wall time: 0.20 s
Out[19]:
f(x) == C1 + x**3/3 + C2*exp(-x)

In [20]: time dsolve(f(x).diff(x, 3) + 3*f(x).diff(x, 2) + 3*f(x).diff(x) + f(x) - 2*exp(-x) + x**2*exp(-x), f(x), hint='nth_linear_constant_coeff_undetermined_coefficients')
CPU times: user 0.09 s, sys: 0.00 s, total: 0.09 s
Wall time: 0.09 s
Out[21]:
f(x) == (C1 + C2*x + C3*x**2 + x**3/3 - x**5/60)*exp(-x)

In [22]: time dsolve(f(x).diff(x, 3) + 3*f(x).diff(x, 2) + 3*f(x).diff(x) + f(x) - 2*exp(-x) + x**2*exp(-x), f(x), hint='nth_linear_constant_coeff_variation_of_parameters')
CPU times: user 0.29 s, sys: 0.00 s, total: 0.29 s
Wall time: 0.29 s
Out[23]:
f(x) == (C1 + C2*x + C3*x**2 + x**3/3 - x**5/60)*exp(-x)


The wall time comparisons here are:

In [24]: 0.20/0.10
Out[24]: 2.0

In [25]: 0.29/0.09
Out[25]: 3.22222222222

So we don’t have orders of magnitude anymore, but it is still 2 to 3 times faster. Of course, most ODEs of this form will have sin or cos terms in them, so the order of magnitude improvement over variation of parameters can probably be attributed to undetermined coefficients in general.

Of course, we know that variation of parameters will still be useful, because functions like \ln{x}, \sec{x} and \frac{1}{x} do not have a finite number of linearly independent derivatives, and so you cannot apply the method of undetermined coefficients to them.

There is one last thing I want to mention. You can indeed multiply any polynomial, exponential, sin, or cos functions together and still get a function that has a finite number of linearly independent derivatives, but if you multiply two or more of the trig functions, you have to apply the power reduction rules to the resulting function to get it in terms of sin and cos alone. Unfortunately, SymPy does not yet have a function that can do this, so to solve such a differential equation with undetermined coefficients (recommended, see above), you will have to apply them manually yourself. Also, just for the record, it doesn’t play well with exponentials in the form of sin’s and cos’s or the other way around (complex coefficients on the arguments), so you should back convert those first too.
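For reference, these are the standard identities that would be applied by hand in that case. They are general trigonometric facts and are not specific to SymPy:

```latex
\sin^2{x} = \frac{1 - \cos{2x}}{2}, \qquad
\cos^2{x} = \frac{1 + \cos{2x}}{2}, \qquad
\sin{x}\cos{x} = \frac{\sin{2x}}{2},
\qquad
\sin{ax}\cos{bx} = \frac{\sin{((a + b)x)} + \sin{((a - b)x)}}{2}
```

For example, a right hand side of \sin{x}\cos{x} would be rewritten as \frac{1}{2}\sin{2x} before matching.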

Well, this concludes the first of two blog posts that I promised. I also promised that I would write about my summer of code experiences. Not only is this important to me, but it is a requirement. I really hope to get this done soon, but with classes, who knows.

August 17, 2009 10:33 PM

asmeurer


Last weekend, Luke came to visit Ondrej in Los Alamos, so I decided to drive him up from Albuquerque and visit him again. It was nice meeting Luke and seeing Ondrej again.

Aside from coding (the main thing that I did was fix an ugly match bug that was preventing dsolve() from recognizing certain ODEs), we visited the atomic museum in Los Alamos, the Valles Caldera, and some of the surrounding hot springs.

Here are some pictures that Luke took with his iPhone. Stupid WordPress seems to insist on flipping some of them (I can’t fix it):
[Photos: Luke; Me and Luke; Me; Me and Ondrej; The Valles Caldera; Ondrej and Me; The road]

This is one of three posts that I plan on doing this week. I just finished my GSoC project today/last night, so I will be blogging about that. I plan on doing a post on the method of Undetermined Coefficients, as well as some other things that I managed to do. The other post will be my general musings/advice for GSoC. That will probably be my last post here in a while. I plan on continuing work with SymPy, but I get very busy with classes, so I most likely won’t be doing much until next summer.

August 17, 2009 06:42 PM

Fabian Pedregosa

Refine module

This commit introduced a new module in sympy: the refine module.

The purpose of this module is to simplify expressions when they are bound to assumptions. For example, if you know that x>0, then you can simplify abs(x) to x. This code was traditionally embedded into the core, but now this will be part of an external module (sympy.refine) upon which the core has no dependencies.

In a not very original move, I named the main function in this module refine(). Its syntax is very straightforward: the first argument is an expression and the second argument is the assumptions. Some examples (from isympy):

In [1]: refine(1+abs(x), Assume(x, Q.positive))
Out[1]: 1 + x

In [2]: refine(exp(I*x*pi), Assume(x, Q.odd))
Out[2]: -1

In [3]: refine(exp(I*x*pi), Assume(x, Q.even))
Out[3]: 1

Right now the module lacks some rules, but the design (very similar to the query module) will make adding these rules an easy task.

August 17, 2009 05:20 PM

August 16, 2009

Andrew Friedley

Closing in on the finish here. I think I've achieved all the goals described in my original proposal, and ended up going a little further using the framework to write custom ufuncs. To refresh, the SVN is here:

https://svn.osl.iu.edu/trac/corepy/browser/branches/afriedle-gsoc-ufunc

I cleaned up the framework a little, and extended it to support multiple output arrays. In fact, any combination of inputs and outputs is supported as long as the total is <= 5. 5 is a limitation of the x86_64 register file -- 2 registers are required for each input or output (a pointer and an element 'step' or increment value), and I've managed to clear up 10 registers for this purpose. I could support 10 inputs/outputs if I pushed the step values onto the stack (or chose not to use them); I wouldn't expect anything more than a minor performance hit from this, if any.

I've experimented with a couple more ufuncs. Wanting to try conditionals, I implemented a 'greater' ufunc and ran into an issue -- I don't know of a good way to return a boolean array (Python boolean objects) from a CorePy-based ufunc; all I have is the native integer/float types. If 1/0 integer values were suitable as truth values, the conditional ufuncs would be no problem.

Moving along, I tested the multiple-output support by writing an 'addsub' ufunc. This ufunc takes two arrays as inputs, and outputs two arrays. The first output array contains the sum of each pair of elements, while the second output array contains the difference. Nothing fancy there, just a simple test to make sure my code did what I wanted.

Mark Jeffree suggested vector normalization as a good example for a custom ufunc:

L = 1.0/sqrt(x**2 + y**2)
X = x * L; Y = y * L

This ended up pretty fun to work with, as it is simple, but just involved enough to allow playing around with various optimization tricks. What I ended up doing was using the AMD optimization manual's suggestion of using the reciprocal-squareroot instruction combined with a single iteration of Newton-Raphson to get accuracy slightly less than IEEE, but with better performance than the sqrt instruction. It's kind of funny how a sequence of ~10 instructions is faster than one single instruction! Another advantage is that the longer instruction sequence pipelines much better when unrolling loop iterations. The sqrt instruction has a latency of something like 18 cycles (might be wrong, that's off the top of my head), and a new sqrt instruction can only be issued every 15 cycles. On the other hand, reciprocal-squareroot with an iteration of Newton-Raphson uses a series of instructions with 4 cycles or less of latency and all can be issued every cycle.

The approach the AMD optimization manual described combined reciprocal-squareroot with Newton-Raphson to get the squareroot as the result (they describe a similar approach for division). I actually wanted the reciprocal of the square root to avoid the division of each component by the length, and was able to remove one multiplication from the Newton-Raphson. For some reason SSE only has the reciprocal-squareroot and reciprocal instructions for 32-bit floats, not 64-bit floats. So the 64-bit float implementation was pretty boring; I just used sqrt and division instructions.
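The numerical idea can be sketched in plain Python. This is not the CorePy code: rough_rsqrt is a stand-in for the low-precision hardware estimate, and its relative error of 3e-4 is an assumption chosen for illustration. The refinement is the usual Newton-Raphson step for the reciprocal square root.

```python
import math


def rough_rsqrt(x, rel_err=3e-4):
    """Stand-in for a low-precision hardware estimate of 1/sqrt(x).
    The deliberate relative error is an assumption chosen for illustration."""
    return (1.0 / math.sqrt(x)) * (1.0 + rel_err)


def refine_rsqrt(x, y):
    """One Newton-Raphson step for g(y) = 1/y**2 - x, whose root is 1/sqrt(x)."""
    return y * (1.5 - 0.5 * x * y * y)


def normalize(x, y):
    """Scale (x, y) to unit length using the refined reciprocal square root."""
    squared = x * x + y * y
    inv_len = refine_rsqrt(squared, rough_rsqrt(squared))
    return x * inv_len, y * inv_len


estimate = rough_rsqrt(25.0)
refined = refine_rsqrt(25.0, estimate)
print(abs(estimate - 0.2) / 0.2)   # about 3e-4
print(abs(refined - 0.2) / 0.2)    # much smaller after one step
print(normalize(3.0, 4.0))         # close to (0.6, 0.8)
```

Because the refined value is already 1/sqrt(x), the two components are scaled with multiplications only. Producing sqrt(x) instead would need one more multiplication by x, followed by a division per component.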

I just spent a while writing a README and fairly detailed documentation of the gen_ufunc() and create_ufunc() functions that make up my framework. Hopefully things are clear enough that others can use this work. This more or less wraps up my project! Over the next week I'm going to try my hand at implementing a custom ufunc to do the entire computation for a particle simulation demo Chris Mueller (my mentor) did. I'm also attending SciPy next week; I will be giving a talk on recent CorePy work on Thursday, I believe. Since the talks got cut down to 10 minutes instead of 35, I'll barely have time to highlight the various things we have going on, spending the most time on what's most relevant to this audience -- my GSoC project.

Expect one more post next week with an update on the particle system and SciPy.

August 16, 2009 02:47 PM