New media provocateur John Bell remixes famous book covers to show what they would have looked like if this year’s big literary merger had chosen the name Random Penguin.

Augmented Reality just got a lot more fashionable. That wasn’t hard, as the previous standard was pretty much Geordi La Forge’s automotive-filter visor.

That said, Google Glass looks pretty serious–like having Siri behind your eyeballs.

The biting video promo for “Fotoshop by Adobé” (pronounced a-do-BEY) imagines the popular image editor as marketed by Revlon et al. The scary thing is how close the video is to reality.

How long before the Occupy line of cosmetics hits Bloomingdale’s?

This project reminds me of a Heath Bunting proposal to paint anamorphic pictures of people on the ground in front of security cameras to confuse their operators.

“A New York-based designer has created a camouflage technique that makes it much harder for computer-based facial recognition to identify a face. Along with the growth of closed-circuit television (CCTV), this has become quite a concern for many around the world, especially in the UK, where being on camera is simply a part of city life. Being recognized automatically by computer is something that hearkens back to 1984 or A Scanner Darkly. As we move further into the 21st century, this futuristic techno-horror fiction is seeming more and more accurate. Never fear, though: CV Dazzle has some styling and makeup ideas that will make you invisible to facial recognition cameras. Why the ‘fabulous’ name? It comes from World War I warship paint that used stark geometric patterning to help break up the obvious outline of the vessel. Apparently it all began as a thesis at the Interactive Telecommunications Program at New York University. It addressed the problems with traditional techniques of hiding the face, like masks and sunglasses, and looked into more socially and legally acceptable ways of styling that could prevent a computer from recognizing your face. Fans of Assassin’s Creed might feel a bit at home with this, as it’s all about hiding in plain sight.”

http://yro.slashdot.org/story/12/01/04/2017215/avoiding-facial-recognition-of-the-future

Meanwhile, for those times when you want to get your face out on your terms, protestors have taken to occupying the sky.

Meet the Occu-Copter. The live-streaming media stars of the Occupy movement are using cheap technology to provide streaming coverage of protest events from the air – challenging the big budgets of mainstream TV news stations.

http://feeds.wired.com/~r/wired/index/~3/lUTUeCDb9O4/

Is that guy in the weight-loss ad really as buff as he looks? How far can you enhance that snapshot for the school newspaper and still have it represent reality? This software tool rates photographs by how much they have been manipulated.

The photographs of celebrities and models in fashion advertisements and magazines are routinely buffed with a helping of digital polish. The retouching can be slight — colors brightened, a stray hair put in place, a pimple healed. Or it can be drastic — shedding 10 or 20 pounds, adding a few inches in height and erasing all wrinkles and blemishes, done using Adobe’s Photoshop software, the photo retoucher’s magic wand.

“Fix one thing, then another and pretty soon you end up with Barbie,” said Hany Farid, a professor of computer science and a digital forensics expert at Dartmouth.

And that is a problem, feminist legislators in France, Britain and Norway say, and they want digitally altered photos to be labeled. In June, the American Medical Association adopted a policy on body image and advertising that urged advertisers and others to “discourage the altering of photographs in a manner that could promote unrealistic expectations of appropriate body image.”

Dr. Farid said he became intrigued by the problem after reading about the photo-labeling proposals in Europe. Categorizing photos as either altered or not altered seemed too blunt an approach, he said.

Dr. Farid and Eric Kee, a Ph.D. student in computer science at Dartmouth, are proposing a software tool for measuring how much fashion and beauty photos have been altered, a 1-to-5 scale that distinguishes the infinitesimal from the fantastic. Their research is being published this week in a scholarly journal, The Proceedings of the National Academy of Sciences….

From left to right, photographs show the five levels of retouching…. The effect, from slight to drastic, may discourage retouching. “Models, for example, might well say, ‘I don’t want to be a 5. I want to be a 1,’ ” he said.
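
Their published metric combines geometric and photometric measurements with human ratings, but the core idea, boiling the before/after difference down to a single 1-to-5 number, is easy to illustrate. Here is a toy Python sketch (not Farid and Kee’s actual method) that summarizes pixel-level change between an original and a retouched photo; the file names and bucket thresholds are invented for the example.

    # Toy stand-in for a retouching score: Farid and Kee's real metric models
    # geometric and photometric distortions and is fit to human ratings.
    import numpy as np
    from PIL import Image

    def retouch_score(before_path, after_path):
        # Load both versions as grayscale arrays (assumes identical dimensions).
        before = np.asarray(Image.open(before_path).convert("L"), dtype=float)
        after = np.asarray(Image.open(after_path).convert("L"), dtype=float)
        # Mean absolute pixel change, as a fraction of the 0-255 range.
        change = np.abs(before - after).mean() / 255.0
        # Bucket onto a 1-to-5 scale; these thresholds are invented.
        thresholds = [0.01, 0.03, 0.07, 0.15]
        return 1 + sum(change > t for t in thresholds)

    print(retouch_score("original.jpg", "retouched.jpg"))  # hypothetical files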

http://feeds.nytimes.com/click.phdo?i=ef370676c5e736ec41a3325885e56f55

This is a pretty cool product that allows users to very quickly create website and mobile phone mockups. It uses a very simple drag-and-drop interface. The actual product is $79, but I used this free demo to create the mockups I wanted, then took screenshots of them.

http://builds.balsamiq.com/b/mockups-web-demo/

Demonstrating the power of many-to-many image- and sound-making, artist Aaron Koblin and his collaborators stitch compelling interfaces from huge data sets. Watch Koblin transform airline flight data into global travel patterns, frame-by-frame drawings into an animated tribute to Johnny Cash, and Google Street View into an Arcade Fire video personalized for each listener.


At the same time that the Obama administration is underwriting hardware for helping citizens of other countries circumvent their own government’s Internet censorship, Apple is patenting a camera that performs a government’s censorship for it.

Privacy advocates Alasdair Allan and Pete Warden have released a free visualization tool to demonstrate how the iPhone stores your movements in a file easily accessible by anyone with access to your phone or computer. (Shown here, my January 19th presence in the Philadelphia International Airport.)

Nothing like a good visualization of your own movements to give you the creeps.
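
Their iPhone Tracker app does the plotting for you, but the underlying store is an ordinary SQLite file, reported in 2011 as consolidated.db with a CellLocation table. Assuming a copy pulled from an unencrypted backup (file, table, and column names per those reports), a minimal sketch of reading it yourself:

    # Read a few location records straight out of the iPhone's location cache.
    import sqlite3

    conn = sqlite3.connect("consolidated.db")  # path into your backup copy
    rows = conn.execute(
        "SELECT Timestamp, Latitude, Longitude FROM CellLocation "
        "ORDER BY Timestamp LIMIT 10")
    for ts, lat, lon in rows:
        # Timestamps count seconds from 2001-01-01, Apple's reference epoch.
        print(ts, lat, lon)
    conn.close()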

Facebook quietly rolled out face recognition in its photo service earlier this year, prompting some to speculate that Facebook users might soon get ads correlated to what they look like or where their pictures appear. But Facebook may not be the only one targeting ads according to what the lens sees. Last month Microsoft’s Chief Financial Officer for interactive entertainment let slip that Kinect’s camera feed offered his company “a bunch of new business opportunities.”

What sort of business opportunities? Well, once its system is trained on actual faces thanks to tags from its own users, Microsoft or Facebook could sell Haar classifiers to other companies for ad targeting (think Xbox ads for acne cream) or to the government for surveillance (think a “Total Information Awareness” database of every person ever caught on a security camera).
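
To make “Haar classifier” concrete: it is the standard sliding-window face detector of the era, and an open-source version ships with OpenCV. A minimal detection sketch, using OpenCV’s stock frontal-face cascade rather than anything from Microsoft or Facebook (the input file name is hypothetical):

    # Detect faces with OpenCV's bundled Haar cascade.
    import cv2

    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

    frame = cv2.imread("crowd.jpg")                  # hypothetical input image
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)   # detector runs on grayscale

    # Each detection is an (x, y, w, h) box around a candidate face.
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)

    cv2.imwrite("tagged.jpg", frame)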

Of course, as new media artist and innovator Mark Daggett pointed out to me, this face-harvesting could have productive applications, such as an iPhone app that scans a crowd and displays each person’s Facebook profile above their heads. Then again, it could have detrimental applications, such as an iPhone app that scans a crowd and displays each person’s Facebook profile above their heads.

“If you’re looking for reasons to upgrade to Photoshop CS5 when it arrives, a new demo video might just persuade you. Narrated by Bryan O’Neil-Hughes, a product manager on the Photoshop team, the video shows the new content-aware fill tool, which has the potential to revolutionise the way you clean up photos. If you’re not happy with an item in your picture, select it, delete it, and Photoshop will analyse the surrounding area and plug the gap as if it never existed.”

http://tech.slashdot.org/story/10/03/24/1725246/Photoshop-CS5s-Showpiece-mdash-Content-Aware-Fill?from=rss&utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+Slashdot%2Fslashdot+%28Slashdot%29
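
Adobe hasn’t published the algorithm (it is widely reported to build on PatchMatch-style synthesis), but OpenCV’s inpainting gives a rough, free approximation of the same select-delete-plug workflow. A minimal sketch, with a hard-coded rectangle standing in for the user’s selection:

    # Approximate content-aware fill with OpenCV inpainting (Telea's method).
    import cv2
    import numpy as np

    img = cv2.imread("photo.jpg")                # hypothetical input
    mask = np.zeros(img.shape[:2], dtype=np.uint8)
    mask[100:200, 150:300] = 255                 # the region to erase (example)

    # Fill the masked region using the surrounding pixels.
    result = cv2.inpaint(img, mask, inpaintRadius=3, flags=cv2.INPAINT_TELEA)
    cv2.imwrite("filled.jpg", result)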

OK, there’s a little bit of code, but mostly pictures. And you could actually use it to do some useful things.

Sikuli, a programming language you can patch together from screenshots, is more visual and more basic than Visual Basic–and you can use it on a Mac.

Video demo

MIT news article (copied below)

January 19, 2010

Until the 1980s, using a computer program meant memorizing a lot of commands and typing them in a line at a time, only to get lines of text back. The graphical user interface, or GUI, changed that. By representing programs, program functions, and data as two-dimensional images–like icons, buttons and windows–the GUI made intuitive and spatial what had been memory intensive and laborious.

But while the GUI made things easier for computer users, it didn’t make them any easier for computer programmers. Underlying GUI components is a lot of computer code, and usually, building or customizing a program, or getting different programs to work together, still means manipulating that code. Researchers in MIT’s Computer Science and Artificial Intelligence Lab hope to change that, with a system that allows people to write programs using screen shots of GUIs. Ultimately, the system could allow casual computer users to create their own programs without having to master a programming language.

The system, designed by associate professor Rob Miller, grad student Tsung-Hsiang Chang, and the University of Maryland’s Tom Yeh, is called Sikuli, which means “God’s eye” in the language of Mexico’s Huichol Indians. In a paper that won the best-student-paper award at the Association for Computing Machinery’s User Interface Software and Technology conference last year, the researchers showed how Sikuli could aid in the construction of “scripts,” short programs that combine or extend the functionality of other programs. Using the system requires some familiarity with the common scripting language Python. But it requires no knowledge of the code underlying the programs whose functionality is being combined or extended. When the programmer wants to invoke the functionality of one of those programs, she simply draws a box around the associated GUI, clicks the mouse to capture a screen shot, and inserts the screen shot directly into a line of Python code.

Suppose, for instance, that a Python programmer wants to write a script that automatically sends a message to her cell phone when the bus she takes to work rounds a particular corner. If the transportation authority maintains a web site that depicts the bus’s progress as a moving pin on a Google map, the programmer can specify that the message should be sent when the pin enters a particular map region. Instead of using arcane terminology to describe the pin, or specifying the geographical coordinates of the map region’s boundaries, the programmer can simply plug screen shots into the script: when this (the pin) gets here (the corner), send me a text.
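
In Sikuli script (Python run on Jython), that bus example might look something like the sketch below. In the real IDE the string file names appear as inline screenshot thumbnails; the region coordinates and the notification step are invented here, since texting a phone would need an outside service.

    # Watch a map region for the bus pin, then alert the user.
    corner = Region(620, 340, 80, 80)        # map area around the corner (made up)
    while not corner.exists("bus_pin.png"):  # captured screenshot of the pin
        wait(30)                             # poll the map every 30 seconds
    popup("The bus is at your corner!")      # stand-in for sending a text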

“When I saw that, I thought, ‘Oh my God, you can do that?’” says Allen Cypher, a researcher at IBM’s Almaden Research Center who specializes in human-computer interactions. “I certainly never thought that you could do anything like that. Not only do they do it; they do it well. It’s already practical. I want to use it right away to do things I couldn’t do before.”

In the same paper, the researchers also presented a Sikuli application aimed at a broader audience. A computer user hoping to learn how to use an obscure feature of a computer program could use a screen shot of a GUI–say, the button that depicts a lasso in Adobe Photoshop–to search for related content on the web. In an experiment that allowed people to use the system over the web, the researchers found that the visual approach cut in half the time it took for users to find useful content.

In the same way that a programmer using Sikuli doesn’t need to know anything about the code underlying a GUI, Sikuli doesn’t know anything about it, either. Instead, it uses computer vision algorithms to analyze what’s happening on-screen. “It’s a software agent that looks at the screen the way humans do,” Miller says. That means that without any additional modification, Sikuli can work with any program that has a graphical interface. It doesn’t have to translate between different file formats or computer languages because, like a human, it’s just looking at pixels on the screen.

In a new paper to be presented this spring at CHI, the premier conference on human-computer interactions, the researchers describe a new application of Sikuli, aimed at programmers working on large software development projects. On such projects, new code accumulates every day, and any line of it could cause a previously developed GUI to function improperly. Ideally, after a day’s work, testers would run through the entire application, clicking virtual buttons and making sure that the right windows or icons still pop up. Since that would be prohibitively time consuming, however, broken GUIs may not be detected until the application has begun the long and costly process of quality assurance testing.

The new Sikuli application, however, lets programmers create scripts that automatically test an application’s GUI components. Visually specifying both the GUI and the window it’s supposed to pull up makes writing the scripts much easier; and once written, they can be run every night without further modification.

But the new application has an added feature that’s particularly heartening to non-programmers. Like its predecessors, it allows users to write their scripts–in this case, GUI tests–in Python. But of course, writing scripts in Python still requires some knowledge of Python–at the very least, an understanding of how to use commands like “dragDrop” or “assertNotExist,” which describe how the GUI components should be handled.
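
In Sikuli’s unit-test mode, those commands read much like any other Python test. A minimal sketch using exactly the two commands the article names, with image file names standing in for captured screenshots:

    # Nightly GUI check: dragging a file to the trash should remove its icon.
    def test_trash_removes_file(self):
        dragDrop("sample_file.png", "trash_icon.png")  # drag file onto trash
        assertNotExist("sample_file.png")              # icon should be gone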

The new application gives programmers the alternative of simply recording the series of keystrokes and mouse clicks that define the test procedure. For instance, instead of typing a line of code that includes the command “dragDrop,” the programmer can simply record the act of dragging a file. The system automatically generates the corresponding Python code, which will include a cropped screen shot of the sample file; but if she chooses, the programmer can reuse the code while plugging in screen shots of other GUIs. And that points toward a future version of Sikuli that would require knowledge neither of the code underlying particular applications nor of a scripting language like Python, giving ordinary computer users the ability to intuitively create programs that mediate between other applications.
