This is the second in a series of blog posts I’m writing on things that Objective-C can learn from Java. The other parts can be found here:
Part 1 (Generics)
Part 2 (Abstract Classes)
Part 3 (Single Source File)
Part 4 (Namespace)

When one is using object-oriented design, a common practice is to group similar classes under a common super-class and put the shared functionality in that super-class. In doing such design, a common problem is that the super-class requires some information that can only be computed by the sub-class. The solution is for the super-class to make a function call on itself which the sub-class implements. For example, I recently designed a class which simplifies storage of an object in a SQL row, but it knows nothing about the actual field names or values stored in the database. In this case, I made a function, which subclasses implement, to retrieve this data. In Java, this is simply done through an abstract method in an abstract class.
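
As a rough sketch of the Java side of that design: the class and method names below (StoredRow, PersonRow, tableName, fieldValues) are made up for illustration and are not the actual class described above. The super-class builds the SQL, and the abstract methods are the pieces only a sub-class can supply.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical super-class: it knows how to build an INSERT statement,
// but not which table or columns a concrete object uses.
abstract class StoredRow {
    // Sub-classes must supply these; the compiler enforces it.
    protected abstract String tableName();
    protected abstract Map<String, Object> fieldValues();

    // Shared functionality lives here and calls the abstract methods.
    String insertStatement() {
        StringJoiner columns = new StringJoiner(", ");
        StringJoiner placeholders = new StringJoiner(", ");
        for (String column : fieldValues().keySet()) {
            columns.add(column);
            placeholders.add("?");
        }
        return "INSERT INTO " + tableName()
                + " (" + columns + ") VALUES (" + placeholders + ")";
    }
}

// Hypothetical concrete sub-class providing the missing information.
class PersonRow extends StoredRow {
    private final String name;
    private final int age;

    PersonRow(String name, int age) {
        this.name = name;
        this.age = age;
    }

    @Override
    protected String tableName() {
        return "person";
    }

    @Override
    protected Map<String, Object> fieldValues() {
        // LinkedHashMap keeps the columns in insertion order.
        Map<String, Object> values = new LinkedHashMap<>();
        values.put("name", name);
        values.put("age", age);
        return values;
    }
}
```

Writing new StoredRow() is a compile error, and so is a PersonRow that leaves out fieldValues(). Those are the two compile-time checks this post is asking for.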

Objective-C has no concept of an abstract method or class. Instead, the closest one can get is to provide an empty implementation for a function and then override that implementation in the sub-classes. Unlike Java, there is no error, or even a warning, if the programmer of the subclass forgets to implement the function that must be overridden. The landscape becomes more confusing when one considers properties. Since Objective-C gives errors when an implementation doesn’t provide the property accessor methods, it provides an annotation, @dynamic, which silences the error. Apple mentions this can be used when an implementation is dynamically provided at runtime, but considering the complexity of that technique, its more common use is to silence the error in the case where a subclass implements the accessor methods. However, in keeping with the inconsistency with which Apple has treated properties, there is no error or warning, nor any way to make the compiler produce one, if the sub-class programmer forgets the implementation. This makes @dynamic really dangerous (just as ignoring warnings can be dangerous).

So, what is really required to make abstract classes work in Objective-C? Again, all of the work is in the compiler and doesn’t affect runtime. First, one needs to be able to define a class as abstract, which tells the compiler that allocation of that class, such as [AbstractClass alloc], should throw an error. Second, properties and methods need to be able to be defined as abstract. This will silence any errors and warnings about a missing implementation in the abstract class, and instead throw those errors and warnings for a missing implementation in a non-abstract sub-class. With a few changes to the compiler, a programmer can do more advanced designs and catch simple, common mistakes at compile time rather than at runtime.

Impact on Runtime:
None, abstract classes and methods are syntax sugar only.

Impact on Code:
Allows more advanced design, and sub-classes become less prone to error.

This is the first in a series of blog posts I’m going to write over the next several days on things that Objective-C can learn from Java. I’ve been programming in Java since 1997, and in Objective-C since 2001. The two languages have a lot of similarities, but there are a few design principles in which Java excels and Objective-C is left behind. This is understandable considering that Objective-C is older than Java, and Java borrowed heavily from Objective-C when it was designed. In this series I’m only going to discuss changes to the language; these items will have very little, if any, impact on the runtime. For the purposes of this discussion, I’m going to use Java’s terminology since it is more familiar to the programming public. This means I’ll talk about functions instead of selectors, and interfaces instead of protocols.

The other parts can be found here:
Part 1 (Generics)
Part 2 (Abstract Classes)
Part 3 (Single Source File)
Part 4 (Namespace)

The first item which Objective-C can learn from Java is Generics. Objective-C is a weakly typed language, too weak for my tastes, whereas Java is strongly typed. There are distinct advantages to a strongly typed language, but I’m not going to get into that here. This means that Objective-C can make a function call on an object without knowing its type, whereas Java must have the object typed to an interface or class which defines that function. This requires typecasting in Java before making function calls, whereas Objective-C does not. In some cases this throws a warning in Objective-C, and in some cases it does not. Missing warnings can lead to programming mistakes, and warnings the programmer ignores lead to bad style. In either case, typecasting will change the mistakes to warnings, and the warnings which are not mistakes to no warnings.

Java had the issue of requiring typecasting to make function calls. This led to numerous typecasts in code, which added to clutter. Java’s solution to much of this clutter came through the use of generics. One of the largest sets of typecasts resulted from fetches from a collection. For example, if one had a list which contains objects of class Person, then it would be convenient if the compiler just remembered that for you. This was done by simply declaring the list as List<Person>. Then, when fetching objects from the List, no typecast is necessary since the compiler already knows that all objects in that list are a Person in the first place. I should add, for the C++ programmers out there, this is not a template. Templates define a whole new set of code for that type, ballooning the object binary, whereas generics can be thought of as automatic typecasting.
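
To make the List<Person> point concrete, here is a minimal Java sketch of the same fetch written both ways. The Person class and the method names are placeholders invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Placeholder class for the illustration.
class Person {
    final String name;

    Person(String name) {
        this.name = name;
    }
}

class GenericsDemo {
    // Pre-generics style: the raw List hands back Object,
    // so every fetch needs an explicit typecast.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static String firstNameWithCast() {
        List people = new ArrayList();
        people.add(new Person("Alice"));
        return ((Person) people.get(0)).name;
    }

    // With generics: the compiler remembers the element type,
    // so the fetch needs no cast.
    static String firstNameWithGenerics() {
        List<Person> people = new ArrayList<>();
        people.add(new Person("Alice"));
        return people.get(0).name;
    }
}
```

Both methods do the same thing once compiled. The only difference is who keeps track of the element type, the programmer or the compiler.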

I have found many arguments that state Objective-C doesn’t need generics because it is a weakly typed language (just Google Objective-C generics to see them). These arguments tend to forget one major piece of the language. Consider the following code:


@interface Person : NSObject {
    NSString   *name;
    NSArray    *siblings;
    Person     *father;
    Person     *mother;
    NSArray    *children;
}

@property (nonatomic, retain) NSString *name;
@property (nonatomic, retain) NSArray *siblings;
@property (nonatomic, retain) Person *mother;
@property (nonatomic, retain) Person *father;
@property (nonatomic, retain) NSArray *children;

@end

In this example, I’ve defined an Objective-C class called Person with a name, an array of siblings, a father, mother, and an array of children. Now, say I wanted to get the first-born child’s name. I could do this with [children objectAtIndex:0].name. Except, this produces an error because [children objectAtIndex:0] returns an id, not a Person *, and properties, unlike functions, are strongly typed in Objective-C. So, the recourse is to either use a temporary variable of type Person *, or to typecast on the same line like: ((Person *)[children objectAtIndex:0]).name. If Objective-C had generics, this would eliminate any need for the typecast or temporary variable since the compiler would already know the type of the object returned by the array.

This is a simple example of where generics become useful in the language, but there are far more. I have actually found Objective-C’s lack of generics limiting my ability to design class hierarchies, and requiring a large amount of unnecessary typecasts or in some cases, additional function calls, which produces slower code.

Should Objective-C adopt all of Java’s generics? I would contend the answer is no. Java’s generics can get very convoluted very quickly, and the biggest source of the mess is in defining a generic function call. While they add convenience, generics in function calls become very messy. However, generics on a class level (and used in its functions) add so much benefit and cleaner code design that Objective-C is really suffering without them. Additionally, Java produces warnings if a class containing generics does not have its generic type defined. For example, List children will produce a warning because I didn’t define the type of object contained within the list. In these cases, since Objective-C is so weakly typed, it’d be most appropriate if its generics just defaulted to the base class allowed in the generic, which in the case of NSArray would be id.
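
Java already behaves much like the proposed default, apart from the warning. Here is a small sketch with an invented Box class: when the generic type is left off, a function returning T falls back to the base class allowed in the generic (Number here), and javac flags the raw use with a warning, which this sketch suppresses.

```java
// Hypothetical class-level generic whose type must be a Number.
class Box<T extends Number> {
    private final T value;

    Box(T value) {
        this.value = value;
    }

    T get() {
        return value;
    }
}

class RawTypeDemo {
    // Raw use: without the annotation, javac warns about these lines.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static double rawValue() {
        Box raw = new Box(42);
        // With the type left off, get() returns the bound, Number.
        Number n = raw.get();
        return n.doubleValue();
    }

    // Typed use: get() returns Integer, so no cast is needed.
    static int typedValue() {
        Box<Integer> typed = new Box<>(42);
        return typed.get();
    }
}
```

For an Objective-C NSArray, the equivalent fallback type would simply be id, with no warning.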

Impact on Runtime:
None, generics are syntax sugar only.

Impact on Code:
Generics added where needed, changing the tracking of object type to the compiler instead of the programmer. Cleaner code, and less prone to error.

I’ve been slowly transitioning to using nginx as the web front-end in an effort to reduce Apache’s memory usage. In keeping with this task, I’m moving more and more off of Apache. One piece I recently moved was trac, which nginx now serves directly by running it in fast-cgi mode, whereas previously it was running as cgi through Apache.

While fast-cgi is faster, it has inherent issues. For example, any memory leak results in ever-growing memory usage, which is exactly why Apache has a setting for each child to serve a limited number of requests before exiting. Trac.fcgi has no such directive, and it has the equivalent of a large memory leak: a non-expiring cache. That is not as bad as a true memory leak, which grows indefinitely instead of reaching a limit, but if the cache size is larger than the memory available for trac to use, it’s just as serious. The only solution, without fixing trac’s caching mechanisms, is to restart trac periodically, but while trac is restarting, all requests are lost, causing bad gateway errors for the user. Additionally, the restart needs to be done manually. Clearly not an ideal solution.

The ideal solution would be for the trac process to be periodically restarted, but all requests be successfully completed. This is what Apache accomplishes with its children, but trac has no such mechanism or even the support for one. So, I had to build it in myself.

My Solution:
The first piece is to create a parent process, which holds the fcgi socket and restarts a child trac process when it dies. This ensures that all waiting requests will be served by either the old or new process. Such a parent process absolutely must have no memory leaks, and so I created one that has only 1 explicit allocation, and it is executed only once.

The second piece, and one that’s considerably harder, is to make the trac process exit gracefully when its memory usage gets too big. The first step was to create a subclass of WSGIServer and override its _mainloopPeriodic to run a periodic check. In this check, I look at the memory usage, and if it’s over 90MB, the server sets itself to exit. The problem is there’s no easy way to figure out the memory usage on Linux. There is a function, getrusage, which is supposed to give resource usage information, such as memory, but Linux gives all zeros (unlike a proper kernel). The only way to get this is to read the information out of /proc, and parse that data. Since this is a more expensive operation, I only conduct the test on every 100th pass through the periodic check.
After doing this, I was still getting periodic bad gateway errors. It turns out that trac spawns a thread to process the request, and that request hadn’t completed when the process exited, dropping the connection and causing the error. According to the documentation, Python is supposed to wait for the thread to complete before exit. Since it wasn’t, I put in a mechanism to see if any threads are running before exit. Here lies a big problem with Python. I found out that the thread, while created, hasn’t actually started. Since it hasn’t started, it isn’t running, which is why Python exited. Furthermore, Python’s threading is so brain-dead, there seems to be no way at all to differentiate between a thread which hasn’t started and one that has exited but not been freed. This means there is no reliable way to detect if all threads have exited. So instead, in order to work around Python, I created a thread-safe counter to count the number of threads. I increment it when the thread is created (not started), and decrement it when the thread completes. I then only allow the main thread to exit when this counter reaches zero (since the main thread does the allocations, this never lets the process die without starting all threads). Given this glaringly bad threading model, I put in another protection mechanism so that the main thread will exit after 30 seconds even if the count isn’t zero, just in case.
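
The counting scheme itself is not tied to Python. As a rough sketch of the idea only, written in Java to match the other examples on this page and not taken from the modified trac.fcgi, with the WorkerCounter class and its method names invented here:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: counts workers from creation (not start) to completion.
class WorkerCounter {
    private final AtomicInteger pending = new AtomicInteger();

    // Count the worker as soon as it is created, before it is started,
    // so a created-but-unstarted thread still blocks the exit.
    Thread create(Runnable job) {
        pending.incrementAndGet();
        return new Thread(() -> {
            try {
                job.run();
            } finally {
                // Decrement even if the job throws.
                pending.decrementAndGet();
            }
        });
    }

    int pendingCount() {
        return pending.get();
    }

    // Wait until every counted worker has finished, or give up after the
    // timeout. Returns true only if the count reached zero.
    boolean awaitDrained(long timeoutMillis) {
        long start = System.nanoTime();
        long timeoutNanos = timeoutMillis * 1_000_000L;
        while (pending.get() > 0) {
            if (System.nanoTime() - start >= timeoutNanos) {
                return false;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

In this sketch the main thread would call awaitDrained(30000) before exiting, and it would exit either way once that call returns. That mirrors the 30 second safety valve described above.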

With the above two pieces, trac’s memory usage is limited and no connections are dropped, but in the time between one process deciding to go down and when it actually does, nothing is processing requests. So, the last piece is to make trac signal to the parent process that it has decided to exit, and then have the parent process launch a new trac to take over while the previous one is exiting. I did this with the USR1 signal: on receiving it, the parent process sends a HUP to the child (in case someone else sent it the USR1 signal) and starts a new child. With these modifications in place, trac has been humming along for nearly a month, being restarted about 2-3 times a day with no issues.

Files:
launcher.c – Requires the environment variable WORKER_PATH to be set to know what process to launch. Best run with something like spawn-fcgi
trac.fcgi – Modified trac.fcgi to incorporate the above mentioned changes, complete with commented code for testing/experimentation.
Enjoy

Earlier I wrote about Google’s link redirection. I have finally finished my testing of a Safari extension which kills this behavior. I didn’t want to release this extension until the updating mechanism worked, and that is what took me so long. Anyway, here is the extension. Enjoy, and let me know what you think.

So, my parents were using a Flip Ultra HD to record sermons. This camera has a serious flaw which the company has acknowledged and failed to fix. First, the camera has 8GB of memory which it formats into a FAT-32 filesystem. This filesystem has a well known limitation where it cannot have a file which exceeds 4GB in size. The Flip camera, when recording in HD, will hit this limitation in about an hour (depending on the motion in the video). The simple solution is to split the recording into multiple files so as to not cause any issues. Pure Digital Technologies, Inc, unlike their competitors, doesn’t seem to have figured out this simple solution but instead elected to have the camera beep and turn itself off. This completely violates what a consumer would expect of a camera, namely that it will record continually until it is out of power or out of memory, or in the old days, tape.

I talked with their live support to confirm this flaw, and amid a bastardized, CPU-hogging chat client, I managed to get confirmation that this flaw is known, and no explanation as to why it has not been fixed aside from “I’m sorry, that’s what the camcorder can handle.”

I later found out that the company is owned by Cisco, and considering my history with their lousy products in the past, I should have already known.
