GSoC 2008 SIP Communicator project: July 2008

Tuesday, July 29, 2008

Activity Log : GSoC Week #14 (21.VII - 28.VII)

A short report on last week's activity:

I did one more check on the first INVITE sent in no-registrar mode. Here is a sample INVITE that is not processed by the destination account:

INVITE sip:test02@w2.x2.y2.z2 SIP/2.0
Call-ID: 01b6fd21bafe064c41d96f2185baf5e0@0.0.0.0
CSeq: 1 INVITE
From: "test01" ;tag=2e27c678
To: "test02"
Via: SIP/2.0/UDP w1.x1.y1.z1:5060;branch=z9hG4bK8f1f612144a850ce33909f5ae91998e7
Max-Forwards: 70
User-Agent: SIP Communicator 1.0-alpha3-0.build.by.SVN Windows XP
Contact: "test01"
Content-Type: application/sdp
Content-Length: 158

v=0
o=test01 0 0 IN IP4 w1.x1.y1.z1
s=-
c=IN IP4 w1.x1.y1.z1
t=0 0
m=audio 5000 RTP/AVP 97 3 0 110 5 8 4
m=video 5002 RTP/AVP 34 26 31
a=recvonly

(w1.x1.y1.z1 is the IP of the call initiator - test01 account; w2.x2.y2.z2 is the IP of the destination peer - test02 account)

This INVITE is the one held in the originalRequest field of the inviteTransaction just before it is sent in the createOutgoingCall method of the OperationSetBasicTelephonySipImpl class. I don't have enough experience with SDP to tell whether something is wrong in that part, but the rest of the INVITE request seems fine as far as I can tell. I'll take another look at the RFC this week to be sure.
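Pending that RFC check, the header part can at least be verified mechanically. RFC 3261 (section 8.1.1) requires every request to carry the To, From, CSeq, Call-ID, Max-Forwards and Via header fields. The following is a minimal sketch of such a presence check in plain Java. The class and method names are hypothetical (they are not part of SC or JAIN SIP); the check looks only at header names, not values, and assumes one header per line with long header names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration; not part of SIP Communicator or JAIN SIP.
class MandatoryHeaderCheck {

    // Header fields that RFC 3261, section 8.1.1, requires in every SIP request.
    private static final List<String> REQUIRED =
        Arrays.asList("To", "From", "CSeq", "Call-ID", "Max-Forwards", "Via");

    // Returns the required header names that do not appear in the raw request.
    // Assumption: one header per line, long header names only (no compact forms).
    static List<String> missingHeaders(String rawRequest) {
        Set<String> present = new HashSet<>();
        String[] lines = rawRequest.split("\r?\n");
        // Line 0 is the request line; the first empty line ends the headers.
        for (int i = 1; i < lines.length && !lines[i].isEmpty(); i++) {
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                // Header names are case-insensitive, so normalise before storing.
                present.add(lines[i].substring(0, colon).trim().toLowerCase(Locale.ROOT));
            }
        }
        List<String> missing = new ArrayList<>();
        for (String name : REQUIRED) {
            if (!present.contains(name.toLowerCase(Locale.ROOT))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Header names taken from the sample INVITE; some values are shortened.
        String invite = String.join("\r\n",
            "INVITE sip:test02@w2.x2.y2.z2 SIP/2.0",
            "Call-ID: 01b6fd21bafe064c41d96f2185baf5e0@0.0.0.0",
            "CSeq: 1 INVITE",
            "From: \"test01\" ;tag=2e27c678",
            "To: \"test02\"",
            "Via: SIP/2.0/UDP w1.x1.y1.z1:5060",
            "Max-Forwards: 70",
            "",
            "v=0");
        System.out.println("Missing: " + missingHeaders(invite));
    }
}
```

All six names are present in the sample INVITE above, so a check of this kind would not by itself explain why the request is not processed.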

Regarding the minor hold issue encountered last week, I'll copy and paste from a mail I sent to the main dev list:

The initial tests I ran were on three versions of SC: the encryption branch, the encryption branch with no-registrar, and the latest main trunk revision alone. I merged the encryption branch with the main trunk again and redid the test (also without the encryption branch). The problem is the same. However, as I said last time, it seems to be related to the registrar, because it doesn't appear in no-registrar mode. The following call stack, captured at the moment the CALL_ENDED event is received in CallSessionImpl, seems to confirm this:

CallSessionImpl.callStateChanged(CallChangeEvent) line: 2115
CallSipImpl(Call).fireCallChangeEvent(String, Object, Object) line: 232
CallSipImpl.setCallState(CallState) line: 113
CallSipImpl.removeCallParticipant(CallParticipantSipImpl) line: 94
CallSipImpl.participantStateChanged(CallParticipantChangeEvent) line: 197
CallParticipantSipImpl(AbstractCallParticipant).fireCallParticipantChangeEvent(String, Object, Object, String) line: 131
CallParticipantSipImpl(AbstractCallParticipant).fireCallParticipantChangeEvent(String, Object, Object) line: 76
CallParticipantSipImpl.setState(CallParticipantState, String) line: 179
OperationSetBasicTelephonySipImpl.processTimeout(TimeoutEvent) line: 1215
ProtocolProviderServiceSipImpl.processTimeout(TimeoutEvent) line: 1121
EventScanner.deliverEvent(EventWrapper) line: 371
EventScanner.run() line: 492 [local variables unavailable]
Thread.run() line: 619 [local variables unavailable]

The request that causes the TimeoutEvent in the processTimeout method of ProtocolProviderServiceSipImpl is an INVITE that looks like this:

INVITE sip:eontest02@w.x.y.z:5060;transport=udp SIP/2.0
Via: SIP/2.0/UDP 0.0.0.0;branch=z9hG4bK6028fc60c92752a388b13aa6193b1379
CSeq: 1 INVITE
Call-ID: 936d9c68884f95bb0317582128a2b8a5@0.0.0.0
From: "eontest01" ;tag=45db520a
To: "eontest02" ;tag=d3c9c209
User-Agent: SIP Communicator 1.0-alpha3-0.build.by.SVN Windows XP
Max-Forwards: 70
Contact:
Route:
Content-Type: application/sdp
Content-Length: 173

(where w.x.y.z is the registrar's IP)

I hope the debug results are of some use but, as I said, the problem might be specific to my registrar/call configuration. It appears mostly (but not exclusively) when the call is placed on hold by the peer that resides on the same machine as the registrar (which is also the one where the above INVITE times out); when the other peer puts the call on hold, it works well most of the time.

Last week I also redid the tests with SC on Windows and Twinkle on Linux after the previous merge. The results are the same as the previous ones, meaning OK, with one exception, which is again related to placing the call on hold. This seems to break the encrypted audio channel: after resuming, Twinkle no longer reports the channel as secured, and the audio stream is lost, replaced by garbled noise. This happens in both cases, whether the Twinkle client or the SC client places the call on hold. Again, this is probably a Twinkle-SC incompatibility regarding the hold feature. To be sure all is fine, I checked in the SC-SC case whether the packets are still encrypted after placing the call on hold (from the peer on which, as I said, hold usually works). The check showed that encryption also works after resuming from hold.

So, as far as the encryption branch is concerned, hold doesn't seem to have any negative impact so far (in SC-SC calls). The problem mentioned last time is not related to the ZRTP integration and is possibly caused by the registrar or by the test configuration as a whole.

I rechecked the CallSessionImpl modifications (prompted by the introduction of the new Hold feature) and thought about restricting securing to the audio stream only, for the moment. However, the idea was based (among other things) on the wrong impression that the SAS string is computed twice, once for the audio stream and once for the video stream, which would have caused display problems. After checking the ZRTP draft again I found that, and I quote: "There is only one SAS value computed per call. That is the SAS value for the first media stream established, which computes the ZRTPSess key, using DH mode." So my impression was wrong; consequently I've left securing enabled for the video RTP manager too.
I redid the merge (as mentioned above) and also made some minor fixes while resolving the conflicts. The no-registrar patch against the latest merge, i.e. the current version of the encryption branch, can be found at:
http://students.info.uaic.ro/~eonica/sc/no-registrar.patch
(It's just an update of the older version done against the current sources)
I've started thinking about the optional GoClear addition to the current implementation. I reread the relevant part of the ZRTP draft and considered some of the necessary modifications. I've actually already made some minor additions in this direction to the ZRTP4J library, which I haven't committed yet. As Werner said in his last mail, however, there are many aspects to consider, from modifying the current state transitions in the library to enhancing the GUI support. For the moment I'll treat this as a secondary priority (since it is optional) and work on it in parallel with other issues, so that if I get stuck at some point the time spent on it won't be a problem.
I've also considered using ZRTP4J as a jar. I think some minor modifications to the original structure of the library are necessary, such as removing the SRTP-related sources, which are already part of the SC code. The rest should be easy, I think: integrating it "externally" (not inside the media jar), the same way I modified the BouncyCastle jar integration. I'll try to do that this week, along with more GoClear additions to ZRTP4J and some test-related work.

Tuesday, July 22, 2008

Activity Log : GSoC Week #13 (14.VII - 20.VII)

Last week I focused on debugging the no-registrar patch. I won't reproduce the entire mail I sent to the main dev list last Wednesday, because it's quite long and maybe a bit over-detailed. I'll summarize it in a few lines and add some further observations.

Let's take this as a use/test case:

We have two SC peers.
Peer A has three accounts created: one registrar mode account R1 and two no-registrar mode accounts: NR1 and NR3.
Peer B has two accounts created: one registrar mode account R2 and one no-registrar mode account NR2.

The registrar server is shut down.
Peer A has NR1 logged in.
Peer B has NR2 logged in.

Peer B (NR2) calls Peer A (NR1) - the INVITE is sent to NR1@w.x.y.z, where w.x.y.z is Peer A's IP address.

The problem: the INVITE request is processed by the SIP stack instantiated by the ProtocolProviderService of the R1 account, not NR1 (R1's provider is of the registrar-mode type). That stack also continues handling the call, so the wrong ProtocolProviderService processes all further requests.

What I can tell for sure after last week's debugging: the SIP stack of R1's ProtocolProviderService handles the INVITE request because, in this specific test case, R1 is the account that is loaded first (which also means that the corresponding ProtocolProviderService is the first of the three registered in the bundle context). The account loading order may differ. After deleting and re-creating all the accounts, for example, the first account loaded was NR3, and the SIP stack/ProtocolProviderService handling the INVITE for NR1 was the one associated with NR3.

I also tried other configurations; all led to the same conclusion: the ProtocolProviderService of the first loaded account is the one that handles the requests when operating in no-registrar mode. It doesn't matter whether the first loaded account is in registrar mode or not, whether it is logged in or not, or whether the request is addressed to a totally different target.

I couldn't find out why this happens. I managed to trace the code back into the JAIN SIP RI sources, as far as the SIPTransactionStack class, but this doesn't help me. The stack is of course the one corresponding to the first loaded account. What I need to know is how the code gets there, i.e. how the stack/ProtocolProviderService is selected to handle the request. The getProviderForAccount and getRegisteredAccounts methods of ProtocolProviderFactorySipImpl don't seem to have anything to do with this, as they are called only before call initiation, in the account loading phase, and I didn't find any bug there. More details, including another test case, can be found in the mail I sent to the dev mailing list.
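The difference between the observed and the expected selection can be sketched in a few lines of plain Java. All names below are hypothetical; this is not how SC or the JAIN SIP RI actually picks a provider (that is exactly the open question), only an illustration of "first loaded provider wins" versus a lookup keyed on the user part of the Request-URI.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; none of these names exist in SC or JAIN SIP.
class RequestDispatchSketch {

    // Providers in account loading order, keyed by the account's SIP user name.
    private final Map<String, String> providersByUser = new LinkedHashMap<>();

    void register(String user, String providerName) {
        providersByUser.put(user, providerName);
    }

    // Extracts "NR1" from "sip:NR1@w.x.y.z:5060"; returns null if there is no user part.
    static String userPart(String requestUri) {
        int scheme = requestUri.indexOf(':');
        int at = requestUri.indexOf('@');
        if (scheme < 0 || at < scheme) {
            return null;
        }
        return requestUri.substring(scheme + 1, at);
    }

    // Observed behaviour: the first loaded provider gets every request.
    String firstLoaded() {
        return providersByUser.isEmpty()
            ? null
            : providersByUser.values().iterator().next();
    }

    // Expected behaviour: the provider whose account matches the Request-URI user.
    // The match is exact because SIP URI userinfo is case-sensitive (RFC 3261, 19.1.4).
    String byRequestUri(String requestUri) {
        String user = userPart(requestUri);
        return user == null ? null : providersByUser.get(user);
    }

    public static void main(String[] args) {
        RequestDispatchSketch dispatch = new RequestDispatchSketch();
        // Loading order from the test case: R1 first, then NR1 and NR3.
        dispatch.register("R1", "provider-R1");
        dispatch.register("NR1", "provider-NR1");
        dispatch.register("NR3", "provider-NR3");

        String requestUri = "sip:NR1@w.x.y.z";
        System.out.println("observed: " + dispatch.firstLoaded());
        System.out.println("expected: " + dispatch.byRequestUri(requestUri));
    }
}
```

With the loading order of the test case, the first method returns the R1 provider for an INVITE addressed to NR1, which matches the symptom; the second returns the NR1 provider.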

After a lot of debugging with no success, I decided to check out the revision that the original patch by Michael Koch was made against, revision 3252, to see what makes the difference. I applied the original patch, activated the no-registrar ProtocolProviderService instantiation, and unfortunately there isn't any difference: the same problem is present.

This happens, however, when the registrar is off. When the registrar is on, the calls routed through it, addressed to registrar-mode accounts, reach their target ProtocolProviderService successfully.

Anyway, even if I couldn't find out what causes the problem, I made some "fixes" to the initial patch submitted:

added the OperationSetTypingNotifications to the others in the registrar mode ProtocolProviderService
cut out the REGISTERS_USE_ROUTE property from the AbstractProtocolProviderServiceSipImpl (it was a duplicate - also in the registrar mode provider service)
restricted the entire keep-alive functionality to registrar-mode ProtocolProviderService instantiations (I'm not very sure about this, but at least using REGISTER for keep-alive in no-registrar mode doesn't seem to make sense; I'm less sure about the OPTIONS request, as I'm not very experienced with SIP; anyway, it is newer functionality added since the initial patch, and I made the modification also to exclude a potential source of the bug .. but it wasn't)
added an if branch in the new sayBye method of OperationSetBasicTelephonySipImpl for the specific cases of a registrar or no-registrar mode provider service (again, I'm not entirely certain whether that is correct)

These, together with other older and more minor modifications can be found in the patch at this address (done this time against the main trunk revision):

http://students.info.uaic.ro/~eonica/sc/current-rev-no-registrar.patch

Note that this patch allows ProtocolProviderService instantiation for both registrar and no-registrar mode. The older patches had the registrar mode option blocked, allowing only no-registrar. Even though the problem described above was already present at that time, the call went fine because it was handled by a no-registrar ProtocolProviderService, as intended.

I'm leaving the no-registrar patch for now, because I've already spent a week debugging it and didn't manage to solve the main bug. I'll post the main points discussed here on the dev list; maybe someone experienced with SIP and the JAIN stack can give me a hint. Until then I'll go back to my main GSoC project theme.

I've merged the encryption branch with the latest main trunk revision and resolved the conflicts, but I haven't committed it yet. I'll probably do so soon, but first I'll report a minor bug I found on the main dev list. It concerns the new Hold functionality.

The bug manifests as follows: when I put a call on hold, then approximately 20 seconds after I release it the call ends abruptly, without any error message. The logger displays the message from the callStateChanged function inside the CallSessionImpl class, "Stopping streaming.", as if a CALL_ENDED event had been received. I didn't debug the code further, because I also tested this in no-registrar mode and it doesn't seem to happen there, so it might be related to the type of registrar used. However, as I said, I haven't committed the merge to the encryption branch yet; I wonder whether putting the call on hold affects the ZRTP integration at some level, and I must take a careful look at it. (At first sight the ZRTP exchange, at least, shouldn't be affected. My first thought on encountering the bug was that it is related to the RTPConnector usage for handling the media flow in the ZRTP integration branch, but it also occurs between two peers built from the main trunk when using my registrar, so that doesn't seem to be it.)

For now you can find the merge patch at this address:

http://students.info.uaic.ro/~eonica/sc/last-rev-merge-patch.patch

That's pretty much all for the moment; I'll probably start thinking about the suggested JUnit testing this week.

Tuesday, July 15, 2008

Activity Log : GSoC Week #12 (7.VII - 13.VII)

Here's a quick report on last week, which was spent mainly on testing:

Made a basic setup for a Linux (openSUSE 11) partition in order to test SC - Twinkle (available only on Linux) intercommunication using ZRTP (the setup took a bit longer than planned due to a lost partition that needed recovery)
Applied the Twinkle patches sent by Werner in order to build it with the latest libzrtpcpp version
Ran SC-Twinkle tests in no-registrar mode: the ZRTP-based encryption activated successfully, and although there were some codec-related problems on one audio channel at first, they disappeared when the tests were repeated (G.711a was selected in the successful case, if I recall correctly)
Ran SC-Twinkle tests in registrar mode: as registrar I used, this time, a very basic configuration of OpenSER on Linux; ZRTP-based encryption activated successfully and the calls went fine
Registered two accounts at Free World Dialup in order to test using an external SIP service provider
Ran SC-Twinkle tests through FWD: calling Twinkle from SC went fine regarding the ZRTP activation and encryption; however, the SC->Twinkle audio channel was "empty" (the selected codec was GSM this time, so I'll probably have to retest with some codec restrictions on Twinkle to see what happens); calling SC from Twinkle unfortunately hung (I haven't investigated the problem further yet)
Ran a few SC-SC tests through FWD (on Windows): all went OK, with no problems to report so far in this case (more testing is needed, however)
All tests done included calling with ZRTP activated from start and also with ZRTP activation during the call (for the SC peer)
All problems reported were encountered also in unsecured calls
Werner made an extensive test report also for SC - Minisip/Zfone calls; basically the ZRTP activation and encryption seem to pose no problem; however there were some problems as in my tests regarding general SC communication

This is pretty much all for the testing done until now. I plan to repeat some of them (especially the ones which had problems even if these weren't ZRTP related).

Also, at the end of last week I started debugging the no-registrar patch, and I continued this week. I'm still stuck on the problem of allowing both registrar-mode and no-registrar accounts. Essentially, it manifests as the wrong service provider (the one for the registrar-mode account) being selected when two providers of different types are present. I've done some debugging on this and haven't found a specific reason yet, but I finally managed to relate the behaviour to the accounts' randomly generated identifiers. This isn't as strange as it seems at first sight. The order in which the accounts are loaded seems to depend on the identifiers, and this is the only difference I've found so far between calls that go on normally and those that fail due to wrong service provider selection. I'll probably post a more detailed report on the SC main dev list soon.

Monday, July 7, 2008

Activity Log : GSoC Week #11 (30.VI - 6.VII)

This week was mainly focused on further no-registrar patch work. Because the activity was described in detail during the week in various mails sent to the public SC main dev mailing list, I'll only give a short summary here, listed chronologically:

found (after quite a while...) the simple solution to the no-registrar patch problem pointed out in the last entry: just make sure the proxy-related fields are blank (or at least not automatically filled with the IP part of the SIP address) at account registration
merged the encryption branch with the SC main trunk
re-adapted the adapted patch to fit the modifications; you can find it here: http://students.info.uaic.ro/~eonica/sc/no-registrar-updated.patch ( or as a zip archive - http://students.info.uaic.ro/~eonica/sc/no-registrar-updated.zip )
dug deeper into a ZRTP exchange hang issue in the DH phase, for which I received a quick solution from Werner
added support for the no-registrar option to the SIP account registration wizard, in order to eliminate the hardcoded version which causes all accounts to use no-registrar mode; however, this brought up some problems in the older, manually adapted patch: one is a faulty call disconnect in some cases in no-registrar mode, and the other is that registrar-mode calls don't seem to work in the patched version and need fixing (no analysis done on the latter yet; for more details about the former, check the mailing list); anyway, you can find the (highly unstable for now) full patch here: http://students.info.uaic.ro/~eonica/sc/no-registrar-with-accregwiz.patch ( or, as a zip archive including only the no-registrar option added to the account through the wizard - which is stable :) - here: http://students.info.uaic.ro/~eonica/sc/no-registrar-accregwiz.zip )

That's pretty much last week's activity in short. For this week I'm planning to set up a Linux partition for further ZRTP tests, and eventually to take a deeper look at the latest no-registrar patch version to see if I can fix the problems mentioned.
