Posts tagged Programming Languages
So, I’d like to do some statistical analysis. I hear that R is really good at this. Let’s download it and take a look.
(Ten minutes later)
AAAHHH! MY EYES! THEY’RE BLEEDING!
What about Matlab? It’s the same story.1 As a programming languages person, these languages make me … well, angry.
Why?
Well, after thinking about this for a while, it seems to me that what I hate most about these languages is their complete lack of a small semantic core.
Take a language like Racket, JavaScript, Java, or C— these languages don’t have a heck of a lot in common, but they share
…
is this all just library design? Most of the things I really hate can easily be constructed in any dynamic library through a suitable application of
Terrible Library Design (tm)
… except that when it applies to things like vector dereference, it feels like fairly ‘core’ syntax.
…
Example time! First, R does this crazy thing in distinguishing logical from numeric vectors.
|
> a
[1] "a" "b" "c" "d"
> a[c(2,4,3)]
[1] "b" "d" "c"
> a[c(FALSE,TRUE)]
[1] "b" "d"
|
In the first of these two array derefs, we’re using the indices from the vector to decide what elements of a
to take. In the second case, though, the index expression is a ‘logical vector’ and is therefore tiled to the length of the original one, and used to decide whether to take the corresponding element.
If you imagine this as part of a language semantics, you’d see this horrible side-condition attached to these rules, where array deref’ing works in totally different ways depending on the kind of argument it gets.
To say nothing of the silent tiling, which seems like an open invitation to horrible bugs.
But wait, we can compound this problem with some nasty coercion:
|
> a[c(4,c(FALSE,TRUE,TRUE))]
[1] "d" "a" "a"
|
What on earth is going on here? First of all, vectors get silently flattened, so that c(3,c(4,5))
is the same as c(3,4,5)
— ugh — but then, the logical values are coerced into numeric ones, so the index vector that’s produced is actually c(4,0,1,1)
, which is then used to index the vector a
. But why are there only three values? Oh, well, there’s no index 0, so let’s just skip that one, shall we?
Honestly, I guess the real problem is in thinking of something like R as a programming language; it’s not. It’s a statistical analysis tool, with a rich text-based interface. After all, would I get upset if Photoshop used ‘b’ for blur and ‘s’ for sharpen and I couldn’t nest them the way that I wanted, using parentheses? Probably not.
And finally: apologies for everything I’ve written. I’ve used R for about fifteen minutes, and this piece is really just me blowing off a small amount of steam. Not well written, not well thought-out. Meh.
Macros are a wonderful thing. A hygienic macro system puts language extensions within the reach of non-compiler-hackers.
However, to date, most modern, hygienic macro systems are associated with languages like Scheme and Racket that are quite high-level. My claim is that macros are probably much more useful in low-level languages. Here’s why:
Being on sabbatical has given me a bit of experience with other systems and languages. Also, my kids are now old enough to “mess around” with programming. Learning from both of these, I’d like to hazard a bit of HtDP heresy: students should learn for i = 1 to 10
before they learn
To many of you, this may seem obvious. I’m not writing to you. Or maybe you folks can just read along and nod sagely.
HtDP takes this small and very lovely thing—recursive traversals over inductively defined data—and shows how it covers a huge piece of real estate. Really, if students could just understand how to write this class of programs effectively, they would have a vastly easier time with much of the rest of their programming careers, to say nothing of the remainder of their undergraduate tenure. Throw a few twists in there—a bit of mutation for efficiency, some memoization, some dynamic programming—and you’re pretty much done with the programming part of your first four years.
The sad thing is that many, many students make it through an entire four-year curriculum without ever really figuring out how to write a simple recursive traversal of an inductively defined data structure. This makes professors sad.
Among the Very Simple applications of this nice idea is that of “indexes.” That is, the natural numbers can be regarded as an inductively defined set, where a natural number is either 0 or the successor of a natural number. This allows you to regard any kind of indexing loop as simply a special case of … a recursive traversal of an inductively defined data structure.
So here’s the problem: in September, you face a bunch of bright-eyed, enthusiastic, deeply forgiving first-year college students. And you give them the recursive traversal of the inductively defined data structure. A very small number of them get it, and they’re off to the races. The rest of them struggle, and struggle, and finally get their teammates to help them write the code, and really wish they’d taken some other class.
NB: the rest of this makes less sense… even to me. Not finished.
However, another big part of the problem is … well, monads are like burritos.
Let me take a step back.
The notion of repeated action is a visceral and easily-understood one. Here’s what I mean. “A human can multiply a pair of 32-bit integers in about a minute. A computer can multiply 32-bit integers at a rate of several billion per second, or about a hundred billion times as fast as a person.” That’s an easily-understood claim: we understand what it means to the same thing a whole bunch of times really fast.
So, when I write
for i=[1..100] multiply_two_numbers();
It’s pretty easy to understand that I’m doing something one hundred times.
Is this post a thinly disguised ripoff of Brian Anderson’s post about embedding Rust in Ruby? Why yes. Yes it is.
Okay, let me start with a little background. Rust is a magnificent language that comes from Mozilla; it’s targeted at programmers who want
- high and predictable performance,
- control over memory layout,
- good support for concurrency, and
- safety.
I think the Mozilla Research homepage is probably the best place to start learning about Rust.
To be honest, though, I’m probably flattering myself if I think that this blog post is being read by anyone who doesn’t already know lots about Rust.
One of the key requirements of a language like Rust is that it be embeddable; that is, it should be possible to call Rust code from another language just as it’s possible to call C code from another language.
This is now possible.
To illustrate this, Brian Anderson posted a lovely example of embedding Rust in Ruby. But of course, embedding Rust in Ruby is pretty much exactly the same as embedding Rust in any other language.
Say, for instance, Racket.
So, without further ado, here’s the setup. You just happen to have a small web app written in Racket that performs a Gaussian Blur. You decide to optimize the performance by porting your code to Rust. Then you want to plug your Rust code into your Racket application. Done! Here’s the github repo that contains all of the code.
Let’s see that again in slow motion.
First, here’s the gaussian blur function, written in Racket. We’re going to stick with a grayscale image. It works fine in color, but the code is just that much harder to read.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33 |
|
Suppose we want to rewrite that in Rust. Here’s what it might look like:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27 |
fn blur_rust(width: uint, height: uint, data: &[u8]) -> ~[u8] {
let filter = [[0.011, 0.084, 0.011],
[0.084, 0.620, 0.084],
[0.011, 0.084, 0.011]];
let mut newdata = ~[];
for uint::range(0, height) |y| {
for uint::range(0, width) |x| {
let mut new_value = 0.0;
for uint::range(0, filter.len()) |yy| {
for uint::range(0, filter.len()) |xx| {
let x_sample = x - (filter.len() - 1) / 2 + xx;
let y_sample = y - (filter.len() - 1) / 2 + yy;
let sample_value = data[width * (y_sample % height) + (x_sample % width)];
let sample_value = sample_value as float;
let weight = filter[yy][xx];
new_value += sample_value * weight;
}
}
newdata.push(new_value as u8);
}
}
return newdata;
}
|
Pretty similar. Of course, it uses curly braces, so it runs about three times faster…
So: what kind of glue code is necessary to link the Rust code to the Racket code? Not a lot. On the Rust side, we need to create a pointer to the C data, then copy the result back into the source buffer when we’re done with the blur:
1
2
3
4
5
6
7
8
9
10
11
12 |
#[no_mangle]
pub extern fn blur(width: c_uint, height: c_uint, data: *mut u8) {
let width = width as uint;
let height = height as uint;
unsafe {
do vec::raw::mut_buf_as_slice(data, width * height) |data| {
let out_data = blur_rust(width, height, data);
vec::raw::copy_memory(data, out_data, width * height);
}
}
}
|
On the Racket side, it’s just a question of making an ffi call, which is super-concise:
And away you go!
I’ve got this code running live at FIXME. What’s that you say? You can’t seem to find FIXME?
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54 |
#
# Globals:
#
# Regex to match balanced [brackets]. See Friedl's
# "Mastering Regular Expressions", 2nd Ed., pp. 328-331.
my $g_nested_brackets;
$g_nested_brackets = qr{
(?> # Atomic matching
[^\[\]]+ # Anything other than brackets
|
\[
(??{ $g_nested_brackets }) # Recursive set of nested brackets
\]
)*
}x;
# Table of hash values for escaped characters:
my %g_escape_table;
foreach my $char (split //, '\\`*_{}[]()>#+-.!') {
$g_escape_table{$char} = md5_hex($char);
}
# Global hashes, used by various utility routines
my %g_urls;
my %g_titles;
my %g_html_blocks;
# Used to track when we're inside an ordered or unordered list
# (see _ProcessListItems() for details):
my $g_list_level = 0;
#### Blosxom plug-in interface ##########################################
# Set $g_blosxom_use_meta to 1 to use Blosxom's meta plug-in to determine
# which posts Markdown should process, using a "meta-markup: markdown"
# header. If it's set to 0 (the default), Markdown will process all
# entries.
my $g_blosxom_use_meta = 0;
sub start { 1; }
sub story {
my($pkg, $path, $filename, $story_ref, $title_ref, $body_ref) = @_;
if ( (! $g_blosxom_use_meta) or
(defined($meta::markup) and ($meta::markup =~ /^\s*markdown\s*$/i))
){
$$body_ref = Markdown($$body_ref);
}
1;
}
|
{% gist 2411008 %}
{% gist 2411005 %}
{% gist 2411004 %}
{% gist 2411002 %}
{% gist 2410993 %}
{% gist 2410918 %}
{% gist 2410916 %}
{% gist 2410910 %}
{% gist 2410909 %}
{% gist 2410908 %}
{% gist 2410905 %}
{% gist 2410904 %}
{% gist 2410899 %}
{% gist 2410889 %}
{% gist 2410886 %}
{% gist 2410882 %}
{% gist 2410860 %}
{% gist 2410856 %}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62 |
<?php
namespace Symfony\Component\Form\Tests\Extension\Core\Type;
use Symfony\Component\Form\FormError;
class DateTimeTypeTest extends LocalizedTestCase
{
public function testSubmit_dateTime()
{
$form = $this->factory->create('datetime', null, array(
'data_timezone' => 'UTC',
'user_timezone' => 'UTC',
'date_widget' => 'choice',
'time_widget' => 'choice',
'input' => 'datetime',
));
$form->bind(array(
'date' => array(
'day' => '2',
'month' => '6',
'year' => '2010',
),
'time' => array(
'hour' => '3',
'minute' => '4',
),
));
$dateTime = new \DateTime('2010-06-02 03:04:00 UTC');
$this->assertDateTimeEquals($dateTime, $form->getData());
}
public function testSubmit_string()
{
$form = $this->factory->create('datetime', null, array(
'data_timezone' => 'UTC',
'user_timezone' => 'UTC',
'input' => 'string',
'date_widget' => 'choice',
'time_widget' => 'choice',
));
$form->bind(array(
'date' => array(
'day' => '2',
'month' => '6',
'year' => '2010',
),
'time' => array(
'hour' => '3',
'minute' => '4',
),
));
$this->assertEquals('2010-06-02 03:04:00', $form->getData());
}
...
}
|
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75 |
# Check for updates on initial load...
if [ "$DISABLE_AUTO_UPDATE" != "true" ]
then
/usr/bin/env ZSH=$ZSH zsh $ZSH/tools/check_for_upgrade.sh
fi
# Initializes Oh My Zsh
# add a function path
fpath=($ZSH/functions $ZSH/completions $fpath)
# Load all of the config files in ~/oh-my-zsh that end in .zsh
# TIP: Add files you don't want in git to .gitignore
for config_file ($ZSH/lib/*.zsh) source $config_file
# Set ZSH_CUSTOM to the path where your custom config files
# and plugins exists, or else we will use the default custom/
if [[ -z "$ZSH_CUSTOM" ]]; then
ZSH_CUSTOM="$ZSH/custom"
fi
is_plugin() {
local base_dir=$1
local name=$2
test -f $base_dir/plugins/$name/$name.plugin.zsh \
|| test -f $base_dir/plugins/$name/_$name
}
# Add all defined plugins to fpath. This must be done
# before running compinit.
for plugin ($plugins); do
if is_plugin $ZSH_CUSTOM $plugin; then
fpath=($ZSH_CUSTOM/plugins/$plugin $fpath)
elif is_plugin $ZSH $plugin; then
fpath=($ZSH/plugins/$plugin $fpath)
fi
done
# Load and run compinit
autoload -U compinit
compinit -i
# Load all of the plugins that were defined in ~/.zshrc
for plugin ($plugins); do
if [ -f $ZSH_CUSTOM/plugins/$plugin/$plugin.plugin.zsh ]; then
source $ZSH_CUSTOM/plugins/$plugin/$plugin.plugin.zsh
elif [ -f $ZSH/plugins/$plugin/$plugin.plugin.zsh ]; then
source $ZSH/plugins/$plugin/$plugin.plugin.zsh
fi
done
# Load all of your custom configurations from custom/
for config_file ($ZSH_CUSTOM/*.zsh) source $config_file
# Load the theme
if [ "$ZSH_THEME" = "random" ]
then
themes=($ZSH/themes/*zsh-theme)
N=${#themes[@]}
((N=(RANDOM%N)+1))
RANDOM_THEME=${themes[$N]}
source "$RANDOM_THEME"
echo "[oh-my-zsh] Random theme '$RANDOM_THEME' loaded..."
else
if [ ! "$ZSH_THEME" = "" ]
then
if [ -f "$ZSH/custom/$ZSH_THEME.zsh-theme" ]
then
source "$ZSH/custom/$ZSH_THEME.zsh-theme"
else
source "$ZSH/themes/$ZSH_THEME.zsh-theme"
fi
fi
fi
|
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55 |
class SelectDateWidget(Widget):
"""
A Widget that splits date input into three <select> boxes.
This also serves as an example of a Widget that has more than one HTML
element and hence implements value_from_datadict.
"""
none_value = (0, '---')
month_field = '%s_month'
day_field = '%s_day'
year_field = '%s_year'
def __init__(self, attrs=None, years=None, required=True):
# years is an optional list/tuple of years to use in the "year" select box.
self.attrs = attrs or {}
self.required = required
if years:
self.years = years
else:
this_year = datetime.date.today().year
self.years = range(this_year, this_year+10)
def render(self, name, value, attrs=None):
try:
year_val, month_val, day_val = value.year, value.month, value.day
except AttributeError:
year_val = month_val = day_val = None
if isinstance(value, basestring):
if settings.USE_L10N:
try:
input_format = get_format('DATE_INPUT_FORMATS')[0]
v = datetime.datetime.strptime(value, input_format)
year_val, month_val, day_val = v.year, v.month, v.day
except ValueError:
pass
else:
match = RE_DATE.match(value)
if match:
year_val, month_val, day_val = [int(v) for v in match.groups()]
choices = [(i, i) for i in self.years]
year_html = self.create_select(name, self.year_field, value, year_val, choices)
choices = MONTHS.items()
month_html = self.create_select(name, self.month_field, value, month_val, choices)
choices = [(i, i) for i in range(1, 32)]
day_html = self.create_select(name, self.day_field, value, day_val, choices)
output = []
for field in _parse_date_fmt():
if field == 'year':
output.append(year_html)
elif field == 'month':
output.append(month_html)
elif field == 'day':
output.append(day_html)
return mark_safe(u'\n'.join(output))
|
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46 |
def link
if f.linked_keg.directory? and f.linked_keg.realpath == f.prefix
opoo "This keg was marked linked already, continuing anyway"
# otherwise Keg.link will bail
f.linked_keg.unlink
end
keg = Keg.new(f.prefix)
keg.link
rescue Exception => e
onoe "The linking step did not complete successfully"
puts "The formula built, but is not symlinked into #{HOMEBREW_PREFIX}"
puts "You can try again using `brew link #{f.name}'"
keg.unlink
ohai e, e.backtrace if ARGV.debug?
@show_summary_heading = true
end
def fix_install_names
Keg.new(f.prefix).fix_install_names
rescue Exception => e
onoe "Failed to fix install names"
puts "The formula built, but you may encounter issues using it or linking other"
puts "formula against it."
ohai e, e.backtrace if ARGV.debug?
@show_summary_heading = true
end
def clean
require 'cleaner'
Cleaner.new f
rescue Exception => e
opoo "The cleaning step did not complete successfully"
puts "Still, the installation was successful, so we will link it into your prefix"
ohai e, e.backtrace if ARGV.debug?
@show_summary_heading = true
end
def pour
fetched, downloader = f.fetch
f.verify_download_integrity fetched, f.bottle_sha1, "SHA1"
HOMEBREW_CELLAR.cd do
downloader.stage
end
end
|