Personal Beef with Image Processing Tools
Jamie Kawabata, Tuesday, February 05, 2008 - 02:32PM // comment: 0
Ok so this is going to be a miniature rant. I'm not going to pick out any particular tool, since pretty much all of them do it. (All of them that I know of.) Most recently I've been using ImageMagick, so my current frustration is at that, but Photoshop and GIMP both do it also.
Here's the thing: an image uses integers, usually 0 to 255 (called "code values"), to represent the brightness of each color of each pixel. No problem there. And the relationship between code values and actual brightness is not linear. No problem there either. Actually, the fact that the relationship is not linear is a good thing. The problem occurs when these programs act as if the relationship were linear. Pretty much anyone who knows anything about image processing knows this, and yet all these programs act the same.
For a grayscale image, the code value 0 means black while 255 means white. Because of the nonlinear relationship between code values and light intensity, the code value 128 is not half as bright as 255. It's not even close. But if you blur a black and white (0 and 255) checkerboard pattern, the resulting value is 128. The correct code value, the one that corresponds to an intensity halfway in between, is around 186.
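To make those numbers concrete, here is a minimal Python sketch. It assumes a plain power-law gamma of 2.2, which is an assumption since no particular transfer curve is named here (real sRGB uses a slightly different piecewise formula). The helper names are made up for illustration.

```python
GAMMA = 2.2  # assumed display gamma, not taken from any particular tool


def to_linear(code, gamma=GAMMA):
    """Convert a 0..255 code value to linear light intensity in 0..1."""
    return (code / 255.0) ** gamma


def to_code(linear, gamma=GAMMA):
    """Convert linear light intensity in 0..1 back to a 0..255 code value."""
    return round(255.0 * linear ** (1.0 / gamma))


def average_codes(a, b):
    """Average two code values in linear light instead of on the code values."""
    return to_code((to_linear(a) + to_linear(b)) / 2.0)


naive_midpoint = (0 + 255 + 1) // 2          # what most tools compute: 128
correct_midpoint = average_codes(0, 255)      # averaging the light itself: 186
brightness_of_128 = to_linear(128)            # roughly 0.22 of full intensity

print(naive_midpoint, correct_midpoint, round(brightness_of_128, 2))
```

Under this assumed gamma, code value 128 carries only about 22% of the light of code value 255, which is why the naively blurred checkerboard looks too dark.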
This problem also occurs when alpha compositing (my most recent encounter). When blending between pure red, which has RGB code values of (255, 0, 0) and pure green (0, 255, 0), at the 50% point, the code values evaluate to (128, 128, 0). But 128 is much dimmer than 50 percent of the brightness, and as a result, the mixture is darker than either of the constituent colors.
Here is an example:

If you look closely at the perimeter of the red circle, you can see where blending of the red and green makes for a darker combination than either of the constituent colors:

Since I care about this sort of thing, I just constructed a method for what I call "gamma-aware alpha compositing" or just "gamma compositing" for short, which produces this better result:

Zooming in shows that it blends properly from red to green:

I shouldn't have to do this. It should work this way already. Why does nobody write programs that work this way? Why do all these programs behave as if the relationship between code value and lightness were linear? Yes, it's simpler, but these same programs offer "gamma" correction, so they should know better.
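For anyone who wants to try it, gamma-aware compositing can be sketched in a few lines of Python. This is not the exact method used for the images above, just the general idea: convert to linear light, blend, convert back. It assumes a power-law gamma of 2.2 and non-premultiplied 8-bit RGB; the function name is hypothetical.

```python
def gamma_composite(fg, bg, alpha, gamma=2.2):
    """Blend fg over bg with coverage alpha (0..1), one channel at a time.

    fg and bg are (R, G, B) tuples of 0..255 code values. The blend is done
    in linear light, assuming a simple power-law gamma (an assumption).
    """
    out = []
    for f, b in zip(fg, bg):
        f_lin = (f / 255.0) ** gamma              # code value -> linear light
        b_lin = (b / 255.0) ** gamma
        mixed = alpha * f_lin + (1.0 - alpha) * b_lin
        out.append(round(255.0 * mixed ** (1.0 / gamma)))  # back to code value
    return tuple(out)


red = (255, 0, 0)
green = (0, 255, 0)
print(gamma_composite(red, green, 0.5))  # (186, 186, 0) rather than (128, 128, 0)
```

At the 50% point this gives (186, 186, 0), so the edge between the two colors no longer dips darker than either of them.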
Ok, end of rant.
Implication Chains Update
Jamie Kawabata, Monday, February 04, 2008 - 04:25PM // comment: 0
Okay, I've updated the Sudoku code to use implication chains, and it gets pretty close. But there are still a few patterns that it can't solve, and as a result, some puzzles are beyond the capability of this program.
For now it's going on hold. I need to get my graphics generator working first.
Sudoku Implication Chains
Jamie Kawabata, Thursday, January 31, 2008 - 12:49PM // comment: 0
Good news / bad news: Bad news is my sudoku solver doesn't solve some puzzles, when previously I had believed that it would solve all of them. The good news is that I get to fix it!
Yesterday I was reading some tips on solving sudoku, and they described a technique called the "X-wing". Interesting. My program doesn't use any such technique and until yesterday it had solved even the hardest problems I could find to test it.
Better yet, they had an example puzzle to demonstrate the X-wing pattern, and sure enough, my program solved it up until the point where the pattern was required, and could get no further.
There were a couple other special patterns as well as the X-wing pattern, but the good news is that I believe they all can be solved with a single algorithm using implication chains.
When there are only two possibilities for a cell, say A and B, and one of them is eliminated, then obviously that cell is known. This means that if another cell in the same row, column, or region were assigned A, the first cell would have to be B, which could in turn imply values for other cells as well.
We can construct implication chains without actually knowing the values of the cells, and then the implication chains that criss-cross the puzzle can tell us a lot about what is and is not possible.
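As a rough illustration of the idea (not the actual solver code), the Python sketch below follows one chain: it tentatively assigns a value to a cell, strikes that value from the cell's peers, and keeps going whenever a peer is left with a single candidate. If some cell ends up with no candidates, the starting assumption is impossible. The data layout (a dict mapping `(row, col)` to a set of candidates for unsolved cells only) and the function names are assumptions made for this example.

```python
def peers(cell):
    """All cells sharing a row, column, or 3x3 region with `cell`."""
    r, c = cell
    br, bc = 3 * (r // 3), 3 * (c // 3)
    same_row = {(r, j) for j in range(9)}
    same_col = {(i, c) for i in range(9)}
    same_box = {(i, j) for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return (same_row | same_col | same_box) - {cell}


def follow_chain(candidates, start, value):
    """Assume start == value and follow the forced consequences.

    Returns a dict of implied assignments, or None if the assumption
    leads to a contradiction. `candidates` itself is not modified.
    """
    work = {cell: set(vals) for cell, vals in candidates.items()}
    work[start] = {value}
    implied = {start: value}
    queue = [start]
    while queue:
        cell = queue.pop()
        v = implied[cell]
        for p in peers(cell):
            if p not in work or v not in work[p]:
                continue
            work[p].discard(v)
            if not work[p]:
                return None  # a peer has no candidates left: contradiction
            if len(work[p]) == 1 and p not in implied:
                implied[p] = next(iter(work[p]))  # forced value, keep chaining
                queue.append(p)
    return implied


# Four unsolved cells, each with two candidates (made-up example data).
cands = {(0, 0): {1, 2}, (0, 5): {2, 3}, (4, 5): {1, 3}, (4, 0): {1, 2}}
print(follow_chain(cands, (0, 0), 2))  # None: assuming 2 is contradictory
print(follow_chain(cands, (0, 0), 1))  # {(0, 0): 1, (4, 0): 2}
```

In this toy position, assuming 2 in the top-left cell fails, so that cell must be 1, and nothing had to be guessed permanently. Whether this kind of propagation alone covers patterns like the X-wing is a separate question; the sketch only handles chains through two-candidate cells.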
The Problem With Themes
Jamie Kawabata, Tuesday, January 29, 2008 - 08:00PM // comment: 0
A lot of web applications support "themes" or "templates" that allow the administrators to customize the look and feel without interfering with the underlying machinery that makes the site work. This is a good thing.
There are a lot of themes for these software packages that look really good. Some suck, but it is surprising how many don't. For any given themeable software package, there are dozens of very attractive themes to choose from. This is a good thing.
If you have a particular color scheme you're looking to use (say, if color were related to your brand) then you will be pretty much out of luck unless you come across an existing theme that happens to use your colors. You may be able to change only the style sheets to get most of the way there, but if the theme has boxes with rounded corners, those graphics will have to be regenerated. Only the most old-fashioned rectangular sites can be adjusted using style sheets alone.
If you don't require a particular color or look, you can simply pick an attractive theme and let that be your look. But as soon as you want to use two applications, you are in trouble. It will be a lot of work to get them to look the same.
What
Suppose there were a web application that, given a color scheme and a theme template, generated a theme. Depending on the complexity of the theme itself, the theme template may be largely static, with a few key places replaced by the customization. Then, with one theme template for each application, the look (at least the colors) could be customized, and the applications would match each other and match your brand.
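A minimal sketch of that idea in Python, using the standard library's `string.Template`. The stylesheet, the placeholder names, and the colors are all invented for illustration; a real theme template would be whatever files the target application expects.

```python
from string import Template

# Hypothetical theme template: mostly static CSS with a few color placeholders.
THEME_TEMPLATE = Template(
    "body { background: $background; color: $text; }\n"
    "a { color: $accent; }\n"
    ".box { border: 1px solid $accent; }\n"
)


def render_theme(template, scheme):
    """Fill a theme template with a color scheme (placeholder name -> color)."""
    return template.substitute(scheme)


brand = {"background": "#ffffff", "text": "#222222", "accent": "#336699"}
css = render_theme(THEME_TEMPLATE, brand)
print(css)
```

With one such template per application, the same `brand` dictionary could be fed to each of them, which is what would keep the applications looking alike.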
In order to make the auto-customizable themes look good, some use of rounded corners and gradients will usually be necessary. For this, code will have to be written to generate the appropriate edge and corner graphics.
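One way such corner graphics could be generated is by supersampling: for each pixel, test a small grid of sample points against the circle and use the fraction that falls inside as the pixel's coverage. The Python sketch below computes only the coverage mask for a top-left corner; writing it out as an image file is left aside. The function name and parameters are assumptions for this example.

```python
def corner_mask(radius, samples=4):
    """Anti-aliased coverage mask for a top-left rounded corner.

    Returns a radius x radius list of rows; each value is the fraction
    (0.0..1.0) of the pixel that lies inside the quarter circle whose
    centre is at (radius, radius).
    """
    mask = []
    for y in range(radius):
        row = []
        for x in range(radius):
            inside = 0
            for sy in range(samples):
                for sx in range(samples):
                    px = x + (sx + 0.5) / samples  # sample point inside the pixel
                    py = y + (sy + 0.5) / samples
                    if (px - radius) ** 2 + (py - radius) ** 2 <= radius ** 2:
                        inside += 1
            row.append(inside / (samples * samples))
        mask.append(row)
    return mask


mask = corner_mask(8)
for row in mask:
    print(" ".join(f"{v:.2f}" for v in row))
```

The fractional values along the arc are what give the smooth edge; they can be used as alpha when painting the theme's color over the page background.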
I've started working on mini-scripts that generate buttons with shading, nicely anti-aliased, with an optional "gloss" effect that's cheesy but seems to be popular these days. We'll see how it goes. I would like to make some money from it, to get rewarded for the trouble, but I'm not sure how. I think I may offer the basic setup for free, but some "special" features may require an extra (fairly low) fee.
Iridium Flare Gadget Completed
Jamie Kawabata, Tuesday, January 29, 2008 - 05:51PM // comment: 0
Finally. This is something I started a while back and left 90% done, and I finally decided to polish it for general release.
What:
This is a "gadget" that pulls Iridium flare data from a website (heavens-above.com) and formats it so it can be displayed on your Google Homepage.
If you don't know what an Iridium Flare is, there is a rudimentary introduction at Wikipedia.
When installed in your Google Homepage, it looks like this:

Why:
There are some programs available for Iridium flare predictions, but to my knowledge heavens-above is the only website that will calculate predictions for you. It is not possible to subscribe to a news feed of flare predictions for a location, so if you want to stay informed, you must:
1. Go to the website frequently (and write down or try to remember when the flares will be).
or
2. Download one of the flare prediction programs, then periodically download fresh orbital data and run the program.
With the gadget for your Google Homepage, you can stay informed every time you visit your Google Homepage. For me this is quite frequent, since I keep all my bookmarks in Google instead of on my computer itself.
To add the gadget to your Google Homepage, click here:

Note: you will have to input your latitude and longitude before you can get Iridium flare predictions.
If you want to know more about this gadget and how it works, or if you're looking to create your own mashup, post a comment here or shoot me an email.
Book notes web app
Jamie Kawabata, Friday, January 25, 2008 - 11:29AM // comment: 0
Background
Blogs linking to each other have created an interesting environment where public online "debates" or "discussions" can occur. With popular blogs, salient incorrectness is called out. When such an incorrectness is exposed, it can damage the original author's reputation, as it should (depending on the nature of the incorrectness). This provides a sort of check-and-balance among sparring factions.
Whether an incorrectness is an exaggeration, a lie, or simply an error comes down to a question of the intent of the original author. It is a tricky subject to navigate and is outside the scope being considered here. In this article these terms will be used interchangeably.
For books and other mass media, the feedback suffers from a problem in that the commentary usually has much less publicity than the media itself. A magazine may print a retraction of an article, or a correction, but such a correction has much less impact than the original article. A subject matter expert may refute the claims made in a book point by point, with backup documentation, but fewer people will read the refutation than will read the book.
For counter-arguments and counter-claims that are subsequent to the original publication, this is natural and perhaps inevitable. But even when a claim has been thoroughly discredited, the original media can continue to be consumed, with readers oblivious to the shaky reputation of the writer. For books, which have a relatively long lifespan, this is the most problematic.
This asymmetry between claim and counter-claim allows media publishers to operate mostly "open loop." Incorrectnesses and exaggerations do not return to the source and affect their reputation to the extent that they should.
It would be cool if...
It would be great if someone (I) created a tool that could shorten and strengthen the feedback loop. Specifically, factual errors or exaggerations should feed directly back to the original author's reputation.
Honestly, I have an idea for a tool, and this is a somewhat grandiose backwards rationalization that extends to changing the culture of media. The actual usefulness of the tool I have in mind is much more mundane.
More later. Perhaps.
phpBB2 import complete -- with a twist
Jamie Kawabata, Thursday, January 24, 2008 - 10:36AM // comment: 0
I repackaged the batch conversion script, so that instead of multiple PHP files, one for each function, there is now a single file with multiple functions. Along the way, I made what seemed a 'minor' change to the SQL that pulls out the topics.
I had changed this:
$query = "SELECT * FROM {$phpbb2Prefix}topics
LEFT JOIN {$phpbb2Prefix}posts_text ON ({$phpbb2Prefix}topics.topic_title = {$phpbb2Prefix}posts_text.post_subject)
LEFT JOIN {$phpbb2Prefix}posts ON ({$phpbb2Prefix}posts.post_id = {$phpbb2Prefix}posts_text.post_id)
ORDER BY topic_time ASC";
to this:
$query = "SELECT * FROM {$phpbb2Prefix}topics
LEFT JOIN {$phpbb2Prefix}posts ON ({$phpbb2Prefix}posts.post_id = {$phpbb2Prefix}topics.topic_first_post_id)
LEFT JOIN {$phpbb2Prefix}posts_text ON ({$phpbb2Prefix}posts_text.post_id = {$phpbb2Prefix}posts.post_id)
ORDER BY topic_time ASC";
It turns out that this change all by itself dramatically improves performance, to the point that the script converts my entire phpBB2 forum in less than 15 seconds. And my site is well over the 15 MB limit described here. (As of this writing -- we'll see if they fold the change back into the main script.)
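One plausible reason for the difference (a reading of the two queries, not something measured here) is that the original join matches topics to posts by comparing title text, which is slow and can pair one topic with many posts, while the new join follows `topic_first_post_id` straight to a single post by its id. The Python sketch below reproduces both joins with `sqlite3` on a toy schema. Only the column names come from the queries above; the table contents are invented, and SQLite stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE topics (topic_id INTEGER PRIMARY KEY, topic_title TEXT,
                     topic_first_post_id INTEGER, topic_time INTEGER);
CREATE TABLE posts (post_id INTEGER PRIMARY KEY, topic_id INTEGER);
CREATE TABLE posts_text (post_id INTEGER PRIMARY KEY, post_subject TEXT);
-- Two topics with the same title; topic 1 also has a reply with that subject.
INSERT INTO topics VALUES (1, 'Hello', 10, 100), (2, 'Hello', 12, 200);
INSERT INTO posts VALUES (10, 1), (11, 1), (12, 2);
INSERT INTO posts_text VALUES (10, 'Hello'), (11, 'Hello'), (12, 'Hello');
""")

# Join on title text: every topic matches every post with the same subject.
OLD_QUERY = """SELECT topics.topic_id, posts.post_id FROM topics
LEFT JOIN posts_text ON (topics.topic_title = posts_text.post_subject)
LEFT JOIN posts ON (posts.post_id = posts_text.post_id)
ORDER BY topic_time ASC"""

# Join on ids: each topic matches exactly its first post.
NEW_QUERY = """SELECT topics.topic_id, posts.post_id FROM topics
LEFT JOIN posts ON (posts.post_id = topics.topic_first_post_id)
LEFT JOIN posts_text ON (posts_text.post_id = posts.post_id)
ORDER BY topic_time ASC"""

old_rows = conn.execute(OLD_QUERY).fetchall()
new_rows = conn.execute(NEW_QUERY).fetchall()
print(len(old_rows), "rows from the title join")
print(new_rows)
```

On this toy data the title join returns six rows for two topics, while the id join returns one row per topic. On a real forum with repeated subjects, that kind of blow-up would add to the cost of the text comparison itself.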
So on the one hand, I'm happy to have found the SQL problem, and on the other hand, I did a lot more work than I had to in order to convert my modest phpBB2 site. There may not be any demand for my batch-conversion script, which will probably mean that I won't get any notoriety for it.
Oh well. If anyone wants it, send me an email or pm.
Importing large forums from phpBB2
Jamie Kawabata, Wednesday, January 23, 2008 - 01:46PM // comment: 0
Why
Another website I manage is currently using phpBB2 for its forums, and I am contemplating moving it over to e107. The problem is that the database is "large". According to the instructions on importing from phpBB2 to e107, the phpBB2 database must be trimmed down to 15 MB or smaller, or else the import will time out.
Now honestly, 15 MB doesn't sound very large to me, not with the cost of today's storage and bandwidth. But they aren't kidding. I tried it anyway with my database of 21 MB, ignoring the warning that trimming the database is "100% vital as the script will timeout otherwise". Not only did it not work, I got a "CPU exceeded error" on my Bluehost account, which causes the entire site to shut down for about 5 or 10 minutes. Ugh.
What
I didn't want to give up on any of my data, even the old forum posts, so I decided to write a script that would copy the data over in batches. With configurable batch sizes, it should be able to stay within the CPU limits, and run quickly enough to not time out.
First I split the script into multiple pieces:
- Deletes all user data from e107 (except for admin), all forums, subforums, and all posts
- Copies all forums from phpBB2 into e107 (all at once, no batches)
- Copies users (a batch at a time)
- Copies posts (a batch at a time)
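The batch-at-a-time steps can be sketched roughly as follows. This is a Python stand-in, not the PHP script itself: it uses `sqlite3` in place of MySQL, a made-up two-column `posts` table, and a plain loop where a hosted script would more likely handle one batch per request. Each batch picks up after the last copied id, so ids are preserved and no row is read twice.

```python
import sqlite3


def copy_in_batches(src, dst, batch_size):
    """Copy all rows of a hypothetical `posts` table from src to dst.

    Rows are fetched batch_size at a time, ordered by id, resuming after
    the last id copied so far. Returns the number of rows copied.
    """
    last_id = 0
    copied = 0
    while True:
        rows = src.execute(
            "SELECT post_id, body FROM posts WHERE post_id > ? "
            "ORDER BY post_id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break  # nothing left to copy
        dst.executemany("INSERT INTO posts (post_id, body) VALUES (?, ?)", rows)
        dst.commit()  # each batch is a small, separate unit of work
        last_id = rows[-1][0]
        copied += len(rows)
    return copied


# Toy source and destination databases (invented data).
src = sqlite3.connect(":memory:")
dst = sqlite3.connect(":memory:")
for db in (src, dst):
    db.execute("CREATE TABLE posts (post_id INTEGER PRIMARY KEY, body TEXT)")
src.executemany("INSERT INTO posts VALUES (?, ?)",
                [(i, f"post {i}") for i in range(1, 26)])

copied = copy_in_batches(src, dst, batch_size=10)
print(copied, "rows copied")
```

Resuming by id rather than by row offset keeps each batch cheap even late in the copy, and it fits the requirement that ids stay identical on both sides.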
Currently the script requires that user IDs from phpBB get copied to identical user IDs in e107, which means that if there are already users in e107, there will be collisions. This is why the entire user table must be wiped out. It's unfortunate, but I haven't added the extra complexity needed to map phpBB users to e107 users.
For simplicity while first getting it working, I split the operations into multiple .php files. Eventually I would like to re-merge them into a single PHP file that handles all four functions.
Release
Since it is not yet "production quality," I do not want to distribute it to the masses and get lots of embarrassing questions about it. But if you are really interested, PM me and I will send it to you.
I'm giving myself a deadline of Monday, January 28th to get it production-ready.
What to expect here
Jamie Kawabata, Wednesday, January 23, 2008 - 01:10PM // comment: 0
I have decided to use this as a platform for a few purposes:
- This will be a space where I can announce what I'm doing and what I'm thinking about
- For certain things I create, this will be a place for me to post them
- This also serves as a place where I can get feedback
Nothing is here yet. But soon...