Monthly Archives: July 2011

ManyToMany Relationships with Intermediary Models in Django

So far, my adventures with Django have been pretty exciting. I’ve created models and forms based on those. I’ve overridden default ModelForm methods to inject extra logic. I’ve learned a lot about Django, and by proxy, Python. Today, I came across a feature (or lack thereof, in my opinion) that I didn’t fully enjoy.

For most many-to-many relationships, all the intermediary table needs to contain are the model1_id and model2_id fields. Django supports this by default: creating a ManyToManyField on a model generates the intermediary table for you. You can then assign models to the relationship via model1.m2m_field.add(model2). The field is also picked up by any ModelForm subclass based on model1, so that when you call save() on the form, all related intermediary rows are inserted as well.

This works great most of the time. But what if you need extra information to be contained within that intermediary table, and therefore, in the intermediary model?

Django allows you to specify an intermediary model when describing ManyToMany relationships. An example of the way I’ve used it is:

class Module(Model):
    # ...

class Ammo(Model):
    # ...

class Fitting(Model):
    # ...
    fitted_modules = ManyToManyField(Module, through = 'FittedModule')

class FittedModule(Model):
    fitting = ForeignKey(Fitting)
    module = ForeignKey(Module)
    count = IntegerField(default = 1)
    ammo = ForeignKey(Ammo, null = True, blank = True)

As you can see, a Fitting has fitted Modules. I need extra information related to the FittedModule: how many modules and what ammo, if any, is loaded. Modules are static information, while Fittings are created by users.

This works perfectly fine. I can create a Fitting, save it, create a FittedModule and assign a Module and the new Fitting and save it. I can do this for multiple FittedModules, and I am able to then access all related Modules via the ManyToMany field on the Fitting model.

Now, normally you would use a FormSet of some type to create a Fitting and FittedModules at the same time. What if you don’t want to or can’t use a FormSet? In my example, I need to create a Fitting, complete with FittedModules, from just a single text field. Basically, the text field contains one Fitted Module per line. My FittingForm:

class FittingForm(forms.ModelForm):
    # ...

    fitting_data = forms.CharField(label = 'Paste your fitting below', max_length = 8096, widget = forms.Textarea)

    class Meta:
        model = Fitting
        exclude = ('fitted_modules',)

Essentially, the fitting_data field is a proxy for the fitted_modules field on the model. Somehow, I need some logic somewhere to make this conversion. In my mind, this should go somewhere in the Form, rather than in the view or somewhere else. To summarize, I’ve overridden the _post_clean() method (naughty, I know) because I need this to occur during the first round of validation – not after validation is complete as in the clean() method.
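The line-by-line conversion mentioned above might look roughly like the sketch below. It is only an illustration: the input format (`Module Name, Ammo Name`, with the ammo part optional), the sample names, and the helpers `parse_fitting_data` and `ParsedLine` are all assumptions, since the actual parsing logic is not shown in this post.

```python
from collections import Counter, namedtuple

# Hypothetical container for one parsed entry; not part of the original code.
ParsedLine = namedtuple("ParsedLine", ["module_name", "ammo_name", "count"])


def parse_fitting_data(text):
    """Turn pasted text into one ParsedLine per distinct (module, ammo) pair.

    Assumed format: one module per line, optionally followed by a comma
    and an ammo name. Blank lines are skipped; repeated lines are counted.
    """
    counts = Counter()
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        # partition() returns empty strings for the separator and the tail
        # when the line contains no comma.
        module_name, _, ammo_name = line.partition(",")
        counts[(module_name.strip(), ammo_name.strip() or None)] += 1
    return [ParsedLine(module, ammo, count)
            for (module, ammo), count in counts.items()]


# Made-up sample input, two identical lines plus one module without ammo.
sample = "Railgun, Antimatter\nRailgun, Antimatter\n\nShield Booster\n"
parsed = parse_fitting_data(sample)
```

Each `ParsedLine` carries what a FittedModule needs (a module, an optional ammo and a count), so the form would only have to look up the matching Module and Ammo rows.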

In the _post_clean() method, I access the instance property (which is an instance of Fitting) and set the properties directly. This is what normally happens when _post_clean() is called, so it only feels right. The problem starts when I try to save Fitted Modules.

class FittingForm(forms.ModelForm):
    # ...

    def _post_clean(self):
        # call the parent implementation via this class, not via ModelForm itself
        super(FittingForm, self)._post_clean()

        fitting = self.instance

        # some logic to get a Module from a line in fitting_data

        fitted_module = FittedModule()
        fitted_module.module = module
        fitted_module.fitting = fitting

Note the last line. The problem here is that fitting has not been saved yet, so fitting_id is not set on the new FittedModule object. If you attempt to call fitted_module.save(), the db layer raises an exception stating that the fitted_module cannot be saved because fitting_id cannot be null. You get this error even if you call fitted_module.save() after the fitting is saved. fitting_id is not a live reference to the pk property of the fitting; rather, it is set once, at the time the fitted_module.fitting property is assigned. At least, it seems to work that way.
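The behavior described above can be mimicked without Django. The sketch below is a toy model, not Django’s actual implementation: the `Parent` and `Child` classes are hypothetical stand-ins for Fitting and FittedModule, and the property setter copies the parent’s pk once, at assignment time, which is the assumption the paragraph above arrives at.

```python
class Parent:
    """Stand-in for a model instance; pk stays None until save() is called."""

    _next_pk = 1

    def __init__(self):
        self.pk = None

    def save(self):
        if self.pk is None:
            self.pk = Parent._next_pk
            Parent._next_pk += 1


class Child:
    """Stand-in for a model with a foreign key to Parent."""

    def __init__(self):
        self._parent = None
        self.parent_id = None

    @property
    def parent(self):
        return self._parent

    @parent.setter
    def parent(self, value):
        self._parent = value
        # The id is copied once, at assignment time; it is not a live reference.
        self.parent_id = value.pk


parent = Parent()
child = Child()

child.parent = parent        # parent is unsaved, so parent_id becomes None
parent.save()                # parent.pk is now set...
stale_id = child.parent_id   # ...but the copied id is still None

child.parent = parent        # re-assigning copies the fresh pk
fresh_id = child.parent_id
```

Under that assumption, re-assigning the foreign key after the parent has been saved is enough to refresh the stored id.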

So, what I’ve done is create a new list within the Form instance to keep track of the intermediary models created while the form was cleaning. When save() is called on the form, the Fitting instance is saved. Unless commit = False, the form then iterates through the list of intermediary models, re-sets each foreign key property that points at the Fitting, and saves each model. If commit = False, then, similar to the save_m2m() method that Django creates, a save_related() method is created and attached to the form instance.

class MyForm(forms.ModelForm):
    def __init__(self, *args, **kwargs):
        # intermediary models created during cleaning, saved after the instance
        self.save_after_self = []

        super(MyForm, self).__init__(*args, **kwargs)

    def save(self, commit = True):
        super(MyForm, self).save(commit)

        def save_related():
            for model in self.save_after_self:
                for field in model._meta.fields:
                    if isinstance(field, ForeignKey):
                        # field.rel.to is the related model class (field.remote_field.model in Django 2.0+)
                        if isinstance(self.instance, field.rel.to):
                            # re-assign so the now-populated pk is copied to <field>_id
                            setattr(model, field.name, self.instance)
                model.save()

        if commit:
            save_related()
        else:
            self.save_related = save_related

        return self.instance

Hopefully you’ll name your class more appropriately, but you can see that I iterate through the save_after_self property and then look at each model’s field metadata. If a field is a ForeignKey that points at the model class of the form’s instance, I re-set that property.

As a disclaimer, I’m still very new to Django and Python in general. I’ve probably missed something completely obvious or have broken some cardinal rule. Hopefully the Python gods don’t smite me for this.

TIL: The Compact Ternary Operator in PHP

An extremely useful tool in nearly any language is the ternary operator. A completely useless example:

$var = "hello";

echo $var == "hello" ? $var : "goodbye";

Just now, my good friend Zach Badgett let me know about PHP’s (as of 5.3) so-called ‘compact’ ternary operator. It lets you leave out the middle operand, in which case the expression evaluates to the first operand when it is truthy and to the last one otherwise:

echo $var ?: "goodbye";

It is semi-useful, I suppose. Something I’d like to see is an implementation of something similar to MySQL’s IFNULL. To avoid using undefined variables or indexes, I use this type of snippet often:

echo empty($var) ? "" : $var;

It would be extremely nice if I could cut that down to a single function call. However, even writing a function to do the above has issues. For example, when you pass an object that utilizes magic methods, you get all kinds of issues. This happens even when passing by reference.

It seems, for now, I’ll have to stick with the expanded ternary operator.

 

Snakes, dingos, Python and Django

Recently, I’ve begun working on an EVE-related project using Django. It’s my first real Python project so it is obviously slow work. I will probably end up iterating over the basic structure several times as I gain a more in-depth understanding of Django and especially Python.

Python is an interesting language, to say the least. Leaving out how jarring the switch is from a curly-bracket language like PHP, the syntax is very straightforward. The use of named parameters in function calls is one of my favorite features of C# 4.0, so it’s refreshing to get to use those again.

Understanding the difference between lists, tuples and sets, and more importantly, when to use each type, has been one challenge so far. My basic comprehension is that lists and sets are useful for dynamic collections, with the latter offering more efficient membership testing and set operations thanks to the unique value requirement. Tuples, on the other hand, are immutable, and it is less clear to me when they should be used. My best guess is they are generally more efficient than other sequence types at membership testing due to the immutability.

However, I am sure these ideas are completely incorrect and they will change with more experience. I am looking forward to getting familiar with Python. A lot of people rave about it, and I can see some of the allure. If for nothing else, I will expose myself to new methods of problem solving – always a plus.
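A small sketch of how the three types behave may help here. The variable names and values are made up for illustration. One correction to the guess above: membership testing on a tuple is a linear scan, just as it is on a list. It is the set that gets fast lookups, because it is hash-based. Tuples are mostly useful as fixed, immutable records, and because they are hashable they can serve as dictionary keys.

```python
# list: ordered, mutable, duplicates allowed
modules = ["railgun", "armor plate", "railgun"]
modules.append("shield booster")

# set: unordered, unique values, hash-based membership testing
unique_modules = set(modules)
has_railgun = "railgun" in unique_modules

# tuple: immutable, handy as a small fixed record
slot = ("high", 3)

# Tuples of hashable values are hashable, so they can be dict keys
# (a list used as a key would raise TypeError).
fitted = {slot: "railgun"}

# Trying to change a tuple in place raises TypeError.
try:
    slot[0] = "low"
    tuple_was_mutated = True
except TypeError:
    tuple_was_mutated = False
```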

Operation Benchmarking with PHP

There are many ways to skin a cat, they say. This rings true for nearly anything in life, especially with software development. Not only are there best practices and common design patterns, but a language can offer different ways to accomplish even the simplest of tasks.

Many times, given multiple solutions, I’ve wondered what the fastest or most efficient way is. For example, in PHP, determining if a string begins with a sub-string has at least three obvious solutions: strpos, substr or preg_match. We can obviously apply basic wisdom to these ideas: preg_match is definitely not the fastest; the entire regular expression library must be called. However, there is no substitute for hard evidence.

Below is a very simple (abridged) class I’ve written that allows registering test functions and then executing those tests a number of times while keeping track of the results. It disregards the 10% highest and lowest results for a decently accurate mean.

<?php

class Benchmark {

	// ...

	public function execute() {
		$adjustment = round($this->_length * .1, 0);

		echo "Running " . count($this->_tests) . " tests, {$this->_length} times each...\nThe {$adjustment} highest and lowest results will be disregarded.\n\n";

		foreach ($this->_tests as $name => $test) {
			$results = array();

			for ($x = 0; $x < $this->_length; $x++) {
				// microtime(true) returns the current time as a float, in seconds
				$start = microtime(true);

				call_user_func($test);

				$results[] = round(microtime(true) - $start, 10);

			}

			sort($results);

			// remove the lowest and highest 10% (leaving 80% of results)
			for ($x = 0; $x < $adjustment; $x++) {
				array_shift($results);
				array_pop($results);
			}

			$avg = array_sum($results) / count($results);

			echo "For {$name}, out of " . count($results) . " runs, average time was: " . sprintf("%.10f", $avg) . " secs.\n";

			$this->_results[$name] = $avg;
		}

		asort($this->_results);

		// reset() rewinds the array and returns its first value, i.e. the fastest mean
		// (each() was removed in PHP 8)
		$fastestResult = reset($this->_results);

		echo "\n\nResults:\n";
		printf("%-25s	%-20s	%s\n", "Test Name", "Time", "+ Interval");

		foreach ($this->_results as $name => $result) {
			$interval = $result - $fastestResult;

			printf("%-25s	%-20s	%s\n", $name, sprintf("%.10f", $result), "+" . sprintf("%.10f", $interval));

		}
	}
}

To use the Benchmark type, create a new instance, register tests and call the execute method.

Consider the following snippet to test our original question: (Note that I am using anonymous functions, so you’ll need PHP >= 5.3 for this to work as-is)

$bm = new Benchmark();
$bm->register("substr()", function() {
	$str = "document_test";

	$result = (substr($str, 0, 9) == "document_");
});

$bm->register("strpos()", function() {
	$str = "document_test";

	// strpos() takes the haystack first, then the needle
	$result = (strpos($str, "document_") === 0);
});

$bm->register("preg_match()", function() {
	$str = "document_test";

	$result = (preg_match("/^(document_)/", $str) == 1);
});

$bm->execute();

/*
 OUTPUT:

 Running 3 tests, 1000 times each...
 The 100 highest and lowest results will be disregarded.

 For substr(), out of 800 runs, average time was: 0.0000043887 secs.
 For strpos(), out of 800 runs, average time was: 0.0000040096 secs.
 For preg_match(), out of 800 runs, average time was: 0.0000057400 secs.

 Results:
 Test Name                    Time                    + Interval
 strpos()                     0.0000040096            +0.0000000000
 substr()                     0.0000043887            +0.0000003791
 preg_match()                 0.0000057400            +0.0000017304
*/

If you execute this code repeatedly, you should get pretty consistent results (although likely not identical to mine). In my runs, strpos was always marginally faster than the other two, with preg_match being quite a bit slower.
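The trimming step is the part of the class that keeps outliers from skewing the mean. As a language-neutral illustration, here is a minimal Python sketch of the same idea. The function name `trimmed_mean` and its `trim_fraction` parameter are assumptions for this example and do not appear in the PHP class.

```python
def trimmed_mean(samples, trim_fraction=0.1):
    """Mean of the samples after dropping the lowest and highest fraction.

    Mirrors the PHP class above: sort, drop `cut` values from each end,
    then average what is left.
    """
    ordered = sorted(samples)
    # Round half up, like PHP's round() does for positive values.
    cut = int(len(ordered) * trim_fraction + 0.5)
    kept = ordered[cut:len(ordered) - cut]
    if not kept:
        raise ValueError("not enough samples left after trimming")
    return sum(kept) / len(kept)


# Ten made-up timings with one outlier at each end.
timings = [100, 1, 2, 3, 4, 5, 6, 7, 8, 0]
mean_without_outliers = trimmed_mean(timings)
```

With ten samples and the default 10% trim, one value is dropped from each end, so the 0 and the 100 do not affect the result.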

 

Obviously, this is pretty limited in its uses. Connecting to non-local (or even any external) resources is going to invalidate basically every result, because you can’t reliably account for the time spent “communicating”. Still, there are lots of other things you can test.

Download the complete source here.

Troubles in New Eden

Anyone who knows me or has been around me at any point while intoxicated probably knows I play the space MMO EVE Online. I’ve played the game on and off for going on four years. By on and off I mean for the vast majority of it I was actively subscribed, but for brief periods I would only log in to change skill training.

About eight months ago, I joined up with Reddit’s EVE corporation, Dreddit. Since then, I’ve become decently involved in the happenings of the universe. And there have been a lot of happenings lately.

While avoiding the political in-game aspects, such as the recent fall of the seemingly indestructible Northern Coalition, which are in my opinion much more interesting, there is a lot of talk about CCP, the makers of EVE, and the future of the game itself.

With the release of the Incarna expansion, there has been a huge communal lashing-out against CCP for some of the actions they’ve been taking. An internal CCP newsletter was leaked that discussed, with little regard to the known opinion of the playerbase, so-called Pay2Win mechanics. With the release of Incarna, the Noble eXchange came along, allowing players to transfer real cash-money near directly for in-game items. With this, the entire EVE community exploded into a nasty rage-fest.

Currently, NeX only allows you to purchase vanity items. Perhaps a $70 monocle for your character tickles your interest? The newsletter, however, discussed the ability to purchase “gold ammo” and other exclusive items that would give a player an actual advantage over others. This is similar to the model used in many other MMOs, but one of EVE’s strongest ‘features’ is its extremely “fair” and level playing field. As others have said, a 10-hour hero can grief a four-year-old player. Skillpoints are a measure of ‘How many different things can I do?’ and not at all representative of a player’s skill or ability to inflict damage on others.

Now, I can see both sides of this argument. You can already purchase things indirectly with real money by purchasing a GTC, changing it into a PLEX and selling it for ISK. The problem with using Aurum (the new currency) and NeX is that you would likely be bypassing traditional means of attaining items.

Nearly all of the items on the market are player created. This means that players mined or reprocessed the minerals, researched the blueprints, ran the manufacturing jobs, and moved the goods. There is a huge amount of work that goes into this, and simply bypassing it with AUR is absolutely terrible. It devalues one of the most intriguing parts of EVE: the literally player-driven economy.

Arguably, the real issue lies in the fact that CCP has not been communicating with its player-base. There are massive amounts of speculation and “trust issues” in the community towards CCP. EVE is set apart not only by the unique nature of the game, but also by how much people care about EVE and its future. It is not like WoW, where a bunch of angry 30-year-old kids rage in their mom’s basement (though there are plenty of raging 30-year-old kids in their mom’s basement in EVE) because of a raid gone bad. EVE is a true incarnation of an alternate universe that we use to escape reality for a while. It may not be healthy, but it’s our EVE, damnit. If anything were to happen, I’d have to go outside and … do things. The thought alone makes me sick. :fat:

Hello world!

So, here we are again…

I’ve dropped the C# site thing in favor of WordPress. This will make things much easier on me. Maybe I’ll even write blog posts. Maybe.

This space for sale. ;)