Monday, August 18, 2008

Activity Log : GSoC Week #17 (11.VIII - 17.VIII)

This final GSoC week was dedicated to revising the previous work (and, consequently, to some modifications). I'll reproduce the main parts of the detailed mail already sent on the SC dev list in order to summarize what was done. Before that, I must say that I'll probably continue maintaining this blog after the end of GSoC too, in case of future ZRTP enhancements or other news about the integration in SC.

In addition to the revision steps that follow, I must add that, for the moment, the GoClear - GoSecure feature presented in the last blog entries is temporarily disabled, mainly due to some inconsistencies in the standard that may appear especially in case of transmission errors (for error-free calls it works between SC-SC calls). More about this (and detailed info on how to re-enable it for testing) can be found in one of the mails sent on the SC dev list during last week.

The rest of the revision week activity can be divided into the following sections:
  • I. GUI code restructuring
  • II. Future multistream enhancement issues (and video securing)
  • III. Patches: ZRTP4J as JAR integration and provisional no-registrar
  • IV. Tests summary
  • V. Other revision points

--- I ---

About the code itself: even if not very much was added in terms of new classes, and besides the GoClear - GoSecure changes (temporarily disabled), which meant modifications in many parts of ZRTP4J, the rest of the code was pretty convoluted at some points. So I've done another modification regarding the GUI part, in order to simplify things a bit and also to provide better means of extension for other possible future security solutions. More exactly, I've refactored the ZRTPGUI plugin under the name of SecurityGUI and I've moved all the button-related ZRTP actions inside SCCallback (the ZRTP user callback class, in the media.transform.zrtp package). The move was pretty logical; that class was already handling the changes on the label. Due to this change, the current architecture leaves the former ZRTPGUI plugin (now SecurityGUI) as a simple container for GUI items. These GUI items (now only for ZRTP) can be obtained and used by the GUI handling classes of various future security solutions (as SCCallback does now for ZRTP). Actually this was the idea in the first place; it is also why the general security change event, handled in CallSessionImpl before ZRTP is selected as the default securing algorithm, was designed. In conclusion, the effect of this GUI code restructuring (which actually isn't very much, because the only element affected is the secure button) can be summarized for now as:

- as long as the secure button is in its initial state (command) of "startSecureMode", it has the role of a "general" toggle on/off security button, and its basic action of sending the secure change event is handled in the SecurityGUI plugin;
- in case there is a call session active, setting the security on makes the button get "caught" in the SCCallback class and, after the ZRTP engine initializes, "become" a ZRTP-specific security button; this means that the ActionListener for the button is temporarily set to SCCallback and the button will act by switching between ZRTP-specific commands until the call is over; then it will go back to the "startSecureMode" command and the status of a general security button

(All this design is particularly useful, probably among other things, for the GoClear part in case it is re-enabled, because the button needs more commands there.) I hope it's clear enough, because it's already too much talk about a single (yet complicated...) button :), so I'll get on to the next subject.

--- II ---

I've also temporarily disabled the security for the video RTP manager. This is based mainly on the fact that ZRTP4J doesn't yet support multistream mode, and the way this integration part was implemented until now could cause some unexpected results in case of simultaneous video securing (it may probably work as it is, with some more flags eventually added to the GUI part, but this doesn't comply with the standard, which implies using multistream). The disabling is done in CallSessionImpl and can be undone by removing the comments which are pointed out at specific places by the "Video securing related code" text.
I actually gave some thought to the multistream extension this week, and some basic opinions about the integration would be:

  • The ZRTPTransformEngine variable used at ZRTPConnector creation in the TransformManager class should be a static variable, instantiated once and passed to all connectors (audio and video - one multistream engine for all streams)
  • The SCCallback (ZRTP user callback class) should follow the same idea as one static variable in the TransformManager class
  • The ZRTPTransformEngine should have an addConnector method instead of a setConnector one, and manage an array of connectors (one connector per stream)
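
The three points above could be sketched roughly as follows. This is only an illustration of the idea of one shared engine holding one connector per stream: the class names MultiStreamEngineSketch and StreamConnectorSketch are hypothetical stand-ins, not the real ZRTPTransformEngine or ZRTPConnector, and a list is used here in place of the array mentioned above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for a per-stream connector (audio, video, ...). */
class StreamConnectorSketch {
    final String mediaType;

    StreamConnectorSketch(String mediaType) {
        this.mediaType = mediaType;
    }
}

/** Hypothetical shared engine: a single instance serves every stream. */
class MultiStreamEngineSketch {
    // One static instance, created once and handed to all connectors.
    private static final MultiStreamEngineSketch INSTANCE = new MultiStreamEngineSketch();

    // One connector per stream, in the order the streams were added.
    private final List<StreamConnectorSketch> connectors = new ArrayList<>();

    private MultiStreamEngineSketch() {
    }

    static MultiStreamEngineSketch getInstance() {
        return INSTANCE;
    }

    /** Adds a connector instead of replacing a single one (addConnector vs. setConnector). */
    synchronized void addConnector(StreamConnectorSketch connector) {
        connectors.add(connector);
    }

    /** Returns a read-only snapshot of the registered connectors. */
    synchronized List<StreamConnectorSketch> getConnectors() {
        return Collections.unmodifiableList(new ArrayList<>(connectors));
    }
}
```

With this shape, the audio and the video RTP managers would each call addConnector on the same engine instance, so the engine can later decide per stream how the key agreement is performed.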

This is the "binding"/integration part, between ZRTP4J and SC. Further, the connector array management should probably start to relate closely to the standard. I personally intend to continue also after GSoC, when time permits, to further focus on the ZRTP integration enhancement, and I think multistream is the next part I'll consider in more detail.

--- III ---

Another thing I've done this week was to build SC with ZRTP4J included as a JAR. The ZRTP4J JAR was built from the ZRTP4J sources I've worked on, which also contain the (disabled) modifications for GoClear-GoSecure. In order to reduce the JAR's size I've removed the test package, which isn't used anyway in the integration. I haven't yet committed this build version to the encryption branch. For any further development I think it will be easier for debugging to keep the current branch source structure (with periodic commits to the main trunk). However, for committing the GSoC changes into the main trunk, it might be better to use the JAR version.

A patch which applies to the current encryption branch transforming it into the JAR build version can be found at:

http://students.info.uaic.ro/~eonica/sc/ZRTP4J-jar-integration.patch

The other patch is another update of the provisional no-registrar patch for the current encryption branch. It has the same problems as before, but it also includes a minor fix for the account registration wizard part added to Michael Koch's initial version. (Also related to this patch, I'll try to add an entry on the issue tracker soon about the problems described a few weeks ago.)

This one can be found at:

http://students.info.uaic.ro/~eonica/sc/no-registrar-updated-for-final-gsoc-encryption-branch.patch

--- IV ---

This weekend I've tried to redo some of the tests done previously, in order to have a final GSoC situation. I gathered the (all successful :) ) results in a Google spreadsheet at the address:

http://spreadsheets.google.com/pub?key=pISWJIBcSpdFFBkkNNKpvCQ

(it's my first Google spreadsheet so please don't be too critical of how it looks).

I've also added some older tests I didn't repeat, including the one done by Werner with Minisip/Zfone (hope I got the basic data accurately). Since Free World Dialup seems to be switching to fee-based status soon, I've searched for another free SIP provider service. The first one I've found was Free IP Call ( http://www.freeipcall.com/ ... if I remember correctly ) and the secure call test was successful.

--- V ---

Other things done as a final GSoC revision for ZRTP4J integration:

  • added resource support for all (hope I didn't miss any...) hard-coded strings used for tooltips, labels, etc.; I've also translated these into Romanian (unfortunately my German is pretty bad :) and the other languages are more or less unknown to me, so somebody else should handle this in case of a merge with the main trunk);
  • done some more error handling and other fixes;
  • added a few logger entries and some more comments;

As I also mentioned at the end of the dev list mail, I would like to express my thanks here too to Werner, Romain, Emil and the rest of the SC dev team for guidance, support and for providing a useful learning experience throughout the entire GSoC period.

Monday, August 11, 2008

Activity Log : GSoC Week #16 (4.VIII - 10.VIII)

In this blog entry I'll detail the changes made last week regarding the GUI part of the ZRTP integration. Part of these are related to the added GoClear-GoSecure feature (about which I got feedback from Werner related to some issues present in the ZRTP protocol - which I actually encountered myself too - and consequently I must say that for the moment, even if it's working, its state is provisional).

One of the main changes to the previous GUI implementation is removing the padlock (secure) button from the SC sources (QuickMenu class), along with the modifications done for access in the SC packages, and moving it into the plugin where the SAS label was already present. The button is moved together with the previous label inside a JPanel, which is now the object returned by the getComponent method required by the implemented PluginComponent interface. This panel and its components are obtained in the SCCallback class, which represents the ZRTP GUI callback class, and that is where the main modifications on the GUI triggered by various ZRTP actions are implemented.

The bundle activator class for the ZRTP GUI plugin implements getter methods for the ResourceManagement service, in order to obtain the resources needed for the GUI elements, and also for the Media service, in order to trigger the secure change events through the padlock button. You may wonder whether this button's commands wouldn't be more appropriately handled directly in the ZRTP user callback (SCCallback) class, by making it implement ActionListener, since this class after all has the purpose of updating the state of the GUI elements. Well, I actually gave it some thought before choosing the current approach, and even if it may seem a bit convoluted I decided to leave it this way. The main reason is that I thought of the button as a general secure button - not necessarily a ZRTP-related one. This means that in the future the current ZRTP GUI plugin could be extended for other types of securing protocols, by maintaining the current GUI elements (which might be common to more than one protocol; at least the button is, as the main way of toggling secure mode) and eventually adding other needed elements (and changing the name of the plugin to a more general SecureGUI). Consequently, the button maintains the role - as actually specified some time ago when I first designed the GUI part - of toggling general security on and off for the established call sessions, even if it is placed in the ZRTP GUI plugin. Based on the static security parameter - usingSRTP - a type of securing algorithm is used; for now ZRTP is the only one available (actually I described all this in the blog entry for week 7, so I won't repeat it here).

Given the needs of the provisional GoClear-GoSecure addition and also the general securing usage of the button mentioned in the last paragraph, I decided to establish a basic state model for the button actions as well, by associating specific commands with them according to the moment the button can be pressed.

So, to start with, the button has the "startSecureMode" command associated with it - this is the action command specific to changing the general secure status for the CallSessionImpl, having the effect of setting or unsetting the usingSRTP flag; the command associated with the button (and of course the action) remains the same until a call session is started with secure mode on (usingSRTP true) and the use of a securing algorithm is initiated.

Since the only current securing algorithm is ZRTP, the further commands for the button (states) and associated actions (transitions) apply only to ZRTP for now. In case a future securing algorithm is implemented, specific commands and actions should be added. The current approach practically provides this means of extension (that's why the button starts in a general secure toggle on-off mode, and why all this may seem a bit complicated).

When a call session starts (or was already started - the case of late secure activation), ZRTP is the securing algorithm to be used, and secure mode is set on (the button is toggled), the ZRTPTransformEngine is initialized. This has the effect of calling init from SCCallback (the ZRTP user callback implementation). Init sets the command for the button to "firstZRTPTrigger". You may call this the point where the button "becomes" a ZRTP securing button, by changing its command (state) from its previous status of general secure mode toggle button. The action associated with "firstZRTPTrigger" does nothing when the button is clicked (actually it should display a forced warning tooltip, but for some yet unknown reason it doesn't seem to work). This is a measure to avoid a crash in the case when only one peer started the call with secure mode on, and the other didn't and will activate it later. So, if the peer clicks the button again at this moment, attempting to untoggle secure mode, this won't be allowed (after all, the call is still in unsecure mode). This case is possible only at the first ZRTP activation (that's where the command name comes from).

When the other peer toggles secure mode on, or when both peers start the call with secure mode on, after the ZRTP exchange finally takes place with success, when SCCallback calls secureOn, the button command is set (for both peers of course) to "defaultZRTPAction". This command has the same action associated as the general secure mode command "startSecureMode" - changing the secure mode status for CallSessionImpl. This secure mode change event handling will have the effect of generating alternately GoClear and GoSecure requests from the GUI to the state engine when the user toggles the button.

If a user toggles the button off, generating a GoClear request, this will have the effect at the other peer of an SCCallback call to confirmGoClear. In this case the button command (state) is set to "goClearRemoteToggle" and the button is pressed programmatically. The action for this case is to switch the secure mode off in the CallSessionImpl the same as usual, but the event sent has (as a new parameter added in the current version) the source set as being remote, not local. This causes the event not to be handled as if the user had pressed the button himself, which would normally send another GoClear request from the GUI to the ZRTP state engine (we don't want that; the other peer initiated the GoClear procedure). Immediately after the programmatic button press in SCCallback returns, the button command state is switched back to "defaultZRTPAction", so a further click on the button will generate a normal GoSecure request.
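
The role of the event source described above can be illustrated with a small sketch. It assumes a hypothetical handler class (SecureEventSketch) standing in for the secure event handling in CallSessionImpl; the enum, the field names and the request counter are invented for the example.

```java
/** Hypothetical handler showing why the event carries a source. */
class SecureEventSketch {
    /** Who caused the secure mode change. */
    enum Source { LOCAL, REMOTE }

    // The call starts secured in this example.
    private boolean usingSRTP = true;

    // Counts GoClear requests forwarded to the (imaginary) ZRTP state engine.
    private int goClearRequests;

    /** Switches secure mode off; only a local toggle triggers a GoClear request. */
    void secureOff(Source source) {
        usingSRTP = false;
        if (source == Source.LOCAL) {
            // The user pressed the button: ask the state engine to send GoClear.
            goClearRequests++;
        }
        // REMOTE: the peer already started the procedure, so nothing is sent back.
    }

    boolean isUsingSRTP() {
        return usingSRTP;
    }

    int getGoClearRequests() {
        return goClearRequests;
    }
}
```

Without the source parameter, the programmatic press at the peer would look exactly like a user click and would bounce a second GoClear request back to the initiator.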

If a user toggles the button on, generating a GoSecure request, this will be handled, in terms of button command (state) and associated action, in a way pretty similar to the above case. We have after all the same issue: toggling the peer's button on programmatically, but without also initiating a GoSecure procedure at the peer. This programmatic toggling of the button is done in the showSAS method of SCCallback (probably I should move it to secureOn; I left it there because the SAS is computed only for the first stream in case there are more, but if the GoSecure mode is to be based on multistream, as it should, the change is necessary). The programmatic toggle on is done with respect to two flags: one indicating that this is not the first call securing (so it is a consequence of a subsequent GoSecure request), and the second telling that the change was generated by the peer (this flag is set to true by default, is switched off by the local peer only for its own GoClear or GoSecure request, and is reset back on when the procedure is over).

There is one more command (state) for the button generated by the ZRTP exchange: "revertFromAllowClearFailure". It is set in SCCallback when a warning message is received from the state engine, following a GoClear user request which is denied because the peer doesn't support the mode. This causes the button to generate a secure state change event sent to the CallSessionImpl with the source set as revert, in order to switch back the previously modified usingSRTP flag, but without further secure change event processing. Also, the button is set back as toggled and remains with that command (in this state) set until the end of the call (if the peer doesn't support GoClear at the first attempt, of course it won't support it at later attempts either).

Finally, when the call ends the button command (state) is switched back to the general secure mode toggle "startSecureMode".
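
The button command states walked through in this entry can be summarized in a small model. This is only a sketch: the command strings are the ones named in the text, while the class SecureButtonSketch, its method names and the event strings it records are invented for illustration and are not the actual SCCallback code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the secure button's command (state) handling. */
class SecureButtonSketch {
    static final String START_SECURE_MODE = "startSecureMode";
    static final String FIRST_ZRTP_TRIGGER = "firstZRTPTrigger";
    static final String DEFAULT_ZRTP_ACTION = "defaultZRTPAction";
    static final String GO_CLEAR_REMOTE_TOGGLE = "goClearRemoteToggle";
    static final String REVERT_FROM_ALLOW_CLEAR_FAILURE = "revertFromAllowClearFailure";

    private String command = START_SECURE_MODE;

    // Secure change events "sent" to the call session, recorded for inspection.
    private final List<String> firedEvents = new ArrayList<>();

    String getCommand() { return command; }

    List<String> getFiredEvents() { return firedEvents; }

    /** ZRTP engine initialized: the button becomes a ZRTP button. */
    void onZrtpInit() { command = FIRST_ZRTP_TRIGGER; }

    /** ZRTP exchange succeeded. */
    void onSecureOn() { command = DEFAULT_ZRTP_ACTION; }

    /** Peer sent GoClear: press the button programmatically, then restore the default. */
    void onConfirmGoClear() {
        command = GO_CLEAR_REMOTE_TOGGLE;
        click();
        command = DEFAULT_ZRTP_ACTION;
    }

    /** Local GoClear was denied: revert the flag; the command stays until the call ends. */
    void onAllowClearFailure() {
        command = REVERT_FROM_ALLOW_CLEAR_FAILURE;
        click();
    }

    void onCallEnded() { command = START_SECURE_MODE; }

    /** A button press, interpreted according to the current command. */
    void click() {
        switch (command) {
            case START_SECURE_MODE:
            case DEFAULT_ZRTP_ACTION:
                firedEvents.add("secure-change:local");
                break;
            case GO_CLEAR_REMOTE_TOGGLE:
                firedEvents.add("secure-change:remote");
                break;
            case REVERT_FROM_ALLOW_CLEAR_FAILURE:
                firedEvents.add("secure-change:revert");
                break;
            default:
                // firstZRTPTrigger: the click is ignored.
                break;
        }
    }
}
```

A typical sequence would be onZrtpInit(), then onSecureOn(), then user clicks or onConfirmGoClear() during the call, and finally onCallEnded().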

This is the main detailed description of how the GUI works now - actually mainly the button (the label is changed pretty simply). However, like I said, even if the implementation works now (if all the exchange goes well), according to Werner's feedback, and also given some issues I encountered myself related to the error handling part, GoClear - GoSecure and the related GUI remain after all in a provisional state.

Tuesday, August 5, 2008

Activity Log : GSoC Week #15 (29.VII - 3.VIII)

I've decided last week to go on after all with the GoClear extension even if there were many aspects to be considered about it (like I wrote in the last blog entry). Yesterday, I've finished a basic version of the ZRTP GoClear enhancement. This is a report about the changes done to ZRTP4J and also to the SC sources along with the description of how the GoClear - GoSecure feature is implemented:

Packets added: ZrtpPacketGoClear and ZrtpPacketClearAck simple classes for the GoClear and the ClearAck messages. The first one provides the getter and setter for the clear_hmac field of the packet.

Events added to ZrtpStateClass.EventDataType: ZrtpGoClear to signal GoClear send request to the state engine and ZrtpGoSecure for signaling switching back to secure mode after GoClear.

States added (or with modified handling) to ZrtpStateClass.ZrtpStates: WaitClearAck (was already added to the enumeration but with empty events handling), UnsecuredState (the state in which the engine gets after a successful GoClear transaction), SecureState (added the GoClear related events handling to it); more information about these inside the implementation description which follows

GoClear request:
Let's take a secured call between peer A and peer B as an example. In order to switch to unsecure mode user A clicks the padlock button (displayed as active). This triggers the send of a SecureEvent handled in CallSessionImpl, which finally gets to the ZRTPChangeStatus function's else branch which calls requestGoClear from the ZrtpTransformEngine. The request is forwarded to the ZRTP4J Zrtp class function with the same name. Here a new ZrtpGoClear event is created and sent for processing to the ZrtpStateClass state engine.

The only switch branch for processing the ZrtpGoClear event is the one in the SecureState. In case the engine finds itself in another state when a ZrtpGoClear event is received, this will be handled on the default branch of the state function's switch, meaning it will be treated in a similar way to ZrtpClose. I'm not entirely sure about this part, but it seemed somehow logical from the point of view of a user who cancels the call securing by clicking the padlock button immediately after it is started - the case when the ZrtpGoClear request reaches a state other than SecureState. Please note that the ZrtpGoClear event is, as mentioned, only a request from the user to the state engine and not the event generated by the actual sending of a GoClear packet. However, if the current handling proves not to be appropriate, separate branches can be added for the ZrtpGoClear processing to the event handling methods of all states.

GoClear message sending:
This part is contained in the ZrtpGoClear event processing in the SecureState. First, after receiving the ZrtpGoClear request, it is checked whether the GoClear-GoSecure transition is accepted for the current call. According to section 5.7.2 of the ZRTP draft, this is based on the Allow Clear flags sent in the Confirm1 and Confirm2 messages during the first call securing. For this, I've added a setter and a getter for the Allow Clear flag to ZrtpPacketConfirm. The flag received in the peer's Confirm is obtained in the prepareConfirm2 and prepareConf2Ack methods respectively (depending on the role). For now I've set the sent flag to true (in prepareConfirm1 and prepareConfirm2), but this could probably be enhanced with a custom GUI setting that lets the user allow or disallow secure call clearance. Both the sent and the received flag need to be true in order to permit GoClear - this is checked before going forward inside the ZrtpGoClear request processing branch, with a simple AND between the flags, performed by the newly added goClearAccepted method of the Zrtp class. If the check returns true, prepareGoClear is called from the Zrtp class. The prepareGoClear method, like the other prepare methods, has the final role of returning the GoClear packet, to be sent from the state engine, after computing the clear_hmac. At this point there might be a possible optimization: preparing the GoClear packet in advance, immediately after having the HMAC keys, or after the call is secured.
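
The Allow Clear check comes down to very little code. A minimal sketch, assuming a hypothetical holder class (AllowClearSketch) for the two flags; only the method name goClearAccepted is taken from the text.

```java
/** Hypothetical holder for the Allow Clear flags of the two Confirm messages. */
class AllowClearSketch {
    // Flag put into our own Confirm message (currently always true in the text).
    private final boolean sentAllowClear;

    // Flag read from the peer's Confirm message.
    private final boolean receivedAllowClear;

    AllowClearSketch(boolean sentAllowClear, boolean receivedAllowClear) {
        this.sentAllowClear = sentAllowClear;
        this.receivedAllowClear = receivedAllowClear;
    }

    /** GoClear is permitted only if both peers announced Allow Clear. */
    boolean goClearAccepted() {
        return sentAllowClear && receivedAllowClear;
    }
}
```

If a GUI option for the sent flag were added later, only the first constructor argument would change; the check itself stays the same.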

After the packet is sent, according to the ZRTP draft, the SRTP stream sending should be stopped. For this part, the afterFirstGoClearSent function of the Zrtp class is called from the state engine, which forwards the call to stopStreaming(true) from ZrtpCallback. I've added this function to the callback interface because a method is needed at various points in the GoClear-GoSecure procedure in order to start and stop the media stream. In the case of the ZrtpTransformEngine, the implementation of this method sets a holdFlag. This flag is checked in the transform method, and if it is found to be true, null is returned instead of a transformed packet. In this case, the connector's TransformOutputStream class, which calls transform, returns the unmodified packet's length (as before) without sending the packet anymore (practically the packets are dropped). I don't know if this is the best solution for temporarily stopping the SRTP/RTP stream, but it permits continuing to send ZRTP packets "inside" the same stream (which is needed) - the holdFlag check that generates the packet dropping in the transform method of the ZrtpTransformEngine is done only in case the packet is not a ZRTP packet. Anyway, if someone finds other solutions, it might be a good idea to compare them. After stopping the media stream, the T2 timer is started for GoClear resend, as the draft says, and the current engine state is switched to WaitClearAck.
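
The holdFlag behaviour can be sketched as below. The class HoldingTransformSketch is hypothetical and works on plain byte arrays rather than on the real packet classes; recognizing a ZRTP packet by the magic cookie "ZRTP" (0x5a525450) in bytes 4 to 7 is an assumption made for this example and is not necessarily how ZrtpTransformEngine does it.

```java
import java.util.Arrays;

/** Hypothetical transformer that drops media packets while streaming is on hold. */
class HoldingTransformSketch {
    // ZRTP magic cookie ("ZRTP" in ASCII), located at bytes 4..7 of a ZRTP packet.
    private static final byte[] ZRTP_MAGIC = {0x5a, 0x52, 0x54, 0x50};

    private volatile boolean holdFlag;

    /** stopStreaming(true) puts the media stream on hold, stopStreaming(false) resumes it. */
    void stopStreaming(boolean stop) {
        holdFlag = stop;
    }

    static boolean isZrtpPacket(byte[] packet) {
        return packet.length >= 8
                && Arrays.equals(Arrays.copyOfRange(packet, 4, 8), ZRTP_MAGIC);
    }

    /**
     * Returns the packet to send, or null when it must be dropped.
     * ZRTP packets always pass, so the protocol exchange can continue on hold.
     */
    byte[] transform(byte[] packet) {
        if (holdFlag && !isZrtpPacket(packet)) {
            return null; // media packet dropped while on hold
        }
        // A real engine would SRTP-protect media packets here.
        return packet;
    }
}
```

The caller (the output stream in the text) then treats a null result as "nothing to send" while still reporting the original length to the layer above.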

GoClear message receiving:
This is signaled to the state engine as a normal ZrtpPacket event which is processed only in the SecureState, based on the first/last char message comparison, as in the case of other packets. So, after peer B receives the packet from peer A, it generates the response packet ClearAck by calling prepareClearAck from the Zrtp class. This method performs several tasks. It checks the clear_hmac, stops the audio streaming as the other peer did and as the ZRTP draft specifies, clears the SRTP secrets by calling clearSecrets(), deletes the srtpTransformers from the ZrtpTransformEngine by calling srtpSecretsOff in order to switch to normal RTP traffic, and returns a pre-created ClearAck packet to the state engine. The clearSecrets function removes the SRTP secrets and the related ones, as chapter 5.7.2.1 of the ZRTP draft states. For now the ones included there are: srtpKeyI, srtpSaltI, srtpKeyR, srtpSaltR, pubKeyBytes, rs1IDr, rs2IDr, auxSecretIDr, rs1IDi, rs2IDi, auxSecretIDi, rs1Valid, rs2Valid, hvi, peerHvi, newRs1 and zrtpSession, which is recomputed as a hash of the previous one. I'm not 100% sure whether I missed something or added something I shouldn't have - I'll have to analyze more carefully the role of some of the enumerated ones, because I've placed part of them there only on the criterion that they would be recomputed anyway at a GoSecure toggle.
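
To make the clearSecrets step more concrete, here is a minimal sketch over three of the listed fields. The class SecretsSketch is hypothetical, overwriting the arrays with zeros before dropping them is a choice made for the example, and SHA-256 is only assumed as the hash used to recompute zrtpSession.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

/** Hypothetical holder for a few of the secrets named in the text. */
class SecretsSketch {
    byte[] srtpKeyI;
    byte[] srtpSaltI;
    byte[] zrtpSession;

    SecretsSketch(byte[] srtpKeyI, byte[] srtpSaltI, byte[] zrtpSession) {
        this.srtpKeyI = srtpKeyI;
        this.srtpSaltI = srtpSaltI;
        this.zrtpSession = zrtpSession;
    }

    /** Wipes the SRTP secrets and replaces zrtpSession by a hash of its old value. */
    void clearSecrets() {
        wipe(srtpKeyI);
        srtpKeyI = null;
        wipe(srtpSaltI);
        srtpSaltI = null;

        byte[] previous = zrtpSession;
        try {
            // Assumption: SHA-256 stands in for the negotiated hash.
            zrtpSession = MessageDigest.getInstance("SHA-256").digest(previous);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
        wipe(previous);
    }

    // Overwrite the array content so the key bytes do not linger in memory.
    private static void wipe(byte[] data) {
        if (data != null) {
            Arrays.fill(data, (byte) 0);
        }
    }
}
```

The remaining fields from the list would be handled the same way; the open question from the text (which fields really belong here) is not answered by the sketch.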

The GoClear message receiving is continued by the state engine with sending the returned ClearAck message. After this, the handleGoClear method is called from Zrtp, which forwards the call to ZrtpCallback (ZrtpTransformEngine), which again forwards it to the user callback, which displays an announcement that the peer turned off the secure channel. The ZRTP draft says, I quote: "The endpoint then renders to the user an indication that the media session has switched to clear mode, and waits for confirmation from the user." I implemented this by displaying a popup with only an OK button. However, I'm not entirely sure whether "confirmation" above refers to a simple announcement like this or also requires giving the user the option to deny, but the draft doesn't specify exactly what should happen if the user says "no", so I've gone with the first version. After the user confirms by closing the message box and the called functions start returning, the audio stream is resumed as normal RTP from handleGoClear in the Zrtp class. Finally, switching to UnsecuredState in the state engine concludes this part.

ClearAck message receiving:
While the above actions take place at peer B, peer A was left in WaitClearAck state after sending GoClear. This state can receive two events (besides the default Close/Error branch). The first of them is the response - ClearAck - sent by peer B. This is processed as a ZrtpPacket event by the state engine and goes on like this: the GoClear timer is cancelled and after this the goClearOk function is called from the parent Zrtp class. This method calls the following functions: srtpSecretsOff, which deletes the srtpTransformers; clearSecrets, described above, which deletes the SRTP-related secrets; and stopStreaming(false), which resumes as RTP the media stream that was stopped when sending GoClear. Some GUI update (related to the SAS display) should also be done here, which I still need to do. Finally the state engine switches to UnsecuredState.

GoClear Timer received:
The other possible event in WaitClearAck state is the Timer event triggered by a delay in receiving ClearAck after sending GoClear. In this case the GoClear is resent. If the timer finally expires, the clearSecrets function is called to delete the SRTP secrets as specified by the ZRTP draft. I'm not very sure in which state I should leave the peer in case this happens, or whether the call must be ended. At first sight it would be logical, if the call is not ended, to leave it in UnsecuredState, but not receiving ClearAck probably means that the other peer didn't get our GoClear, so it would continue to try decoding our packets and to send an encoded stream. Given this, the state should probably remain SecureState and the srtpTransformers should be kept (even if the secrets are deleted, ...which doesn't make much sense anyway, but I've left it this way for now - at least you should be able to hear the received stream correctly...). Above all, there is also the fact that at this point the sending stream is stopped, and the draft doesn't say whether it should be resumed or not. In conclusion, this timer expiration is a point in the current implementation which I probably must partially reconsider.
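
The sender side of the GoClear exchange (request, WaitClearAck, ClearAck or Timer) can be put into a compact sketch. The class GoClearSenderSketch, its enum and field names, and the configurable resend limit are all hypothetical; what happens after the resends are exhausted is exactly the open question described above, so the sketch only clears the secrets and leaves the state untouched.

```java
/** Hypothetical sender-side model of the GoClear exchange. */
class GoClearSenderSketch {
    enum State { SECURE, WAIT_CLEAR_ACK, UNSECURED }

    enum Event { GO_CLEAR_REQUEST, CLEAR_ACK, TIMER }

    private State state = State.SECURE;
    private final boolean goClearAccepted;
    private final int maxResends;
    private int resends;
    private boolean streamingStopped;
    private boolean secretsCleared;

    GoClearSenderSketch(boolean goClearAccepted, int maxResends) {
        this.goClearAccepted = goClearAccepted;
        this.maxResends = maxResends;
    }

    State getState() { return state; }

    int getResends() { return resends; }

    boolean isStreamingStopped() { return streamingStopped; }

    boolean isSecretsCleared() { return secretsCleared; }

    void handle(Event event) {
        switch (state) {
            case SECURE:
                if (event == Event.GO_CLEAR_REQUEST && goClearAccepted) {
                    // GoClear sent: stop the media stream and wait for ClearAck.
                    streamingStopped = true;
                    state = State.WAIT_CLEAR_ACK;
                }
                break;
            case WAIT_CLEAR_ACK:
                if (event == Event.CLEAR_ACK) {
                    // Peer confirmed: drop the secrets and resume as plain RTP.
                    secretsCleared = true;
                    streamingStopped = false;
                    state = State.UNSECURED;
                } else if (event == Event.TIMER) {
                    if (resends < maxResends) {
                        resends++; // GoClear is sent again
                    } else {
                        // Resends exhausted: secrets are cleared, the state is
                        // deliberately left as it is (open question in the text).
                        secretsCleared = true;
                    }
                }
                break;
            default:
                break;
        }
    }
}
```

Keeping the exhausted case as a single branch makes it easy to try the alternatives later (ending the call, going to the unsecured state, or falling back to the secure state).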

That's pretty much it about GoClear, but I'm not done yet :) . There is also the GoSecure part which I've tried to implement. (The draft states that secure off - on switches should be possible alternately during a call, in case the call supports GoClear.)

GoSecure request:
Both A and B peers are in UnsecuredState after a successful GoClear exchange. Let's say peer A wants to switch back to secure state after a while. In order to do this the user presses the padlock button again (this should be showing secure off at this moment - padlock in gray - but this GUI part still needs a bit more coding). The way from GUI to the state engine is pretty much the same as in the GoClear case, starting with a SecureEvent handled in the CallSessionImpl and going forward as described in that case until passing a ZrtpGoSecure event finally, to be processed in the event handling function of the UnsecuredState. Here a Commit packet is prepared and sent pretty much the same as in case of sending the Commit packet when a HelloAck is received in the AckSent state. The Commit packet is generated based on the peer's Hello which was saved when the call was first secured as peerHello (in Detect and AckDetected states). The state engine goes into CommitSent state after sending the packet and starting the timer.

Commit receiving in UnsecuredState:
The Commit sent above by peer A, is received by peer B which after the GoClear should also be in UnsecuredState. For this reason the event handling function for this state has also a ZrtpPacket event branch for Commit. An own Commit packet is generated here as in the AckDetected state. I'm not 100% sure this is really necessary but the SRTP secrets related info are deleted at this moment and I left it according to Werner's comment in the AckDetected state: "Parse Hello packet and build an own Commit packet even if the Commit is not send to the peer. We need to do this to check the Hello packet and prepare the shared secret stuff. " After this part things go pretty much like for the Commit received in the WaitCommit state preparing a DHPart1 message, sending it, and switching to WaitDHPart2 state.
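
The two new branches of the UnsecuredState described above can be reduced to a tiny transition function. The names below (GoSecureSketch and its enums) are hypothetical and only mirror the state names used in the text; packet preparation and timers are left out.

```java
/** Hypothetical transition table for the events handled in UnsecuredState. */
class GoSecureSketch {
    enum State { UNSECURED, COMMIT_SENT, WAIT_DH_PART2 }

    enum Event { GO_SECURE_REQUEST, COMMIT_RECEIVED }

    /** Returns the state that follows the given event; other states are left unchanged here. */
    static State next(State current, Event event) {
        if (current != State.UNSECURED) {
            return current;
        }
        switch (event) {
            case GO_SECURE_REQUEST:
                // Local user asked for GoSecure: a Commit is sent to the peer.
                return State.COMMIT_SENT;
            case COMMIT_RECEIVED:
                // Peer's Commit arrived: DHPart1 is sent, DHPart2 is awaited.
                return State.WAIT_DH_PART2;
            default:
                return current;
        }
    }
}
```

From COMMIT_SENT and WAIT_DH_PART2 on, the regular transitions of the first call securing apply, so they are not repeated in the sketch.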

From this moment the state engine enters the normal transitions, as when securing the call for the first time. The functionality described should cover most of the GoClear-GoSecure additions. There are also other minor aspects not mentioned, but I think this is already enough for a blog entry :). There are still some parts of which I'm not entirely sure, and also some which can probably be optimized, as can be seen throughout the report. Also, there are still some parts to be solved related to the GUI state update. Until now the padlock button was used only to send commands towards the CallSessionImpl - in this case, it must also be set according to the state of the call. There are two ways to try controlling it: one is accessing the UIService from the MediaService, and the second is moving it into the plugin I've created for the label - probably using a JPanel to hold them both. I'll try the plugin option because it is the preferred one after all, and if I don't manage to get it working now I'll go for the first one for temporary use.

I haven't committed the source modifications yet. I'll probably do it tomorrow, after I also add some more comments to them (and also fix the GUI part if I manage to do it by then).

Tuesday, July 29, 2008

Activity Log : GSoC Week #14 (21.VII - 28.VII)

A short report about the last week activity:

  • I did one more check regarding the first INVITE sent in the no-registrar mode; one sample INVITE which won't be processed by the destination account is the following:

INVITE sip:test02@w2.x2.y2.z2 SIP/2.0
Call-ID: 01b6fd21bafe064c41d96f2185baf5e0@0.0.0.0
CSeq: 1 INVITE
From: "test01" ;tag=2e27c678
To: "test02"
Via: SIP/2.0/UDP w1.x1.y1.z1:5060;branch=z9hG4bK8f1f612144a850ce33909f5ae91998e7
Max-Forwards: 70
User-Agent: SIP Communicator 1.0-alpha3-0.build.by.SVN Windows XP
Contact: "test01"
Content-Type: application/sdp
Content-Length: 158

v=0
o=test01 0 0 IN IP4 w1.x1.y1.z1
s=-
c=IN IP4 w1.x1.y1.z1
t=0 0
m=audio 5000 RTP/AVP 97 3 0 110 5 8 4
m=video 5002 RTP/AVP 34 26 31
a=recvonly

(w1.x1.y1.z1 is the IP of the call initiator - test01 account; w2.x2.y2.z2 is the IP of the destination peer - test02 account)

The INVITE is the one contained in the originalRequest field of the inviteTransaction before sending it in the createOutgoingCall method of the OperationSetBasicTelephonySipImpl class. I don't have any experience with SDP to be able to tell if something is wrong in that part, but regarding the rest of the INVITE request all seems to be fine I guess. I'll eventually take another look in the RFC this week to be sure.

  • Regarding the minor hold issue encountered last week I'll copy paste from a mail I've sent on the main dev list:

The initial tests I ran were on three versions of SC: the encryption branch, the encryption branch with no-registrar and the latest main trunk revision only. I've merged the encryption branch with the main trunk again and redone the test (also without the encryption branch). The problem is the same. However, like I said last time, it seems it might be related to the registrar, because in no-registrar mode it doesn't appear. The following call stack, caught at the moment where the CALL_ENDED event is received in CallSessionImpl, seems to confirm this:

CallSessionImpl.callStateChanged(CallChangeEvent) line: 2115
CallSipImpl(Call).fireCallChangeEvent(String, Object, Object) line: 232
CallSipImpl.setCallState(CallState) line: 113
CallSipImpl.removeCallParticipant(CallParticipantSipImpl) line: 94
CallSipImpl.participantStateChanged(CallParticipantChangeEvent) line: 197
CallParticipantSipImpl(AbstractCallParticipant).fireCallParticipantChangeEvent(String, Object, Object, String) line: 131
CallParticipantSipImpl(AbstractCallParticipant).fireCallParticipantChangeEvent(String, Object, Object) line: 76
CallParticipantSipImpl.setState(CallParticipantState, String) line: 179
OperationSetBasicTelephonySipImpl.processTimeout(TimeoutEvent) line: 1215
ProtocolProviderServiceSipImpl.processTimeout(TimeoutEvent) line: 1121
EventScanner.deliverEvent(EventWrapper) line: 371
EventScanner.run() line: 492 [local variables unavailable]
Thread.run() line: 619 [local variables unavailable]

The request which causes the TimeoutEvent in the processTimeout function from ProtocolProviderServiceSipImpl is an INVITE which looks like this:

INVITE sip:eontest02@w.x.y.z:5060;transport=udp SIP/2.0
Via: SIP/2.0/UDP 0.0.0.0;branch=z9hG4bK6028fc60c92752a388b13aa6193b1379
CSeq: 1 INVITE
Call-ID: 936d9c68884f95bb0317582128a2b8a5@0.0.0.0
From: "eontest01" ;tag=45db520a
To: "eontest02" ;tag=d3c9c209
User-Agent: SIP Communicator 1.0-alpha3-0.build.by.SVN Windows XP
Max-Forwards: 70
Contact:
Route:
Content-Type: application/sdp
Content-Length: 173

(where w.x.y.z is the registrar's IP)

I hope the debug results will be of some use, but like I said it might be exclusively related to my registrar/call configuration; the problem seems to appear mostly (but not exclusively) when one particular peer places the call on hold - the one which resides on the same machine as the registrar (which is also the one where the above INVITE times out); when the other one puts the call on hold it seems to work well most of the time.

I also redid the tests last week with SC on Windows and Twinkle on Linux after the previous merge. The results are the same as the previous ones, meaning ok, with one exception which is again related to placing the call on hold. This seems to break the encrypted audio channel (after resuming, Twinkle no longer reports the channel as secured and in addition the audio stream is lost, being replaced by garbled noise). This happens in both cases: if the Twinkle client places the call on hold or if the SC client does. Anyway, again this is probably a Twinkle-SC incompatibility regarding the hold feature. To be sure all is fine, I've checked in the SC-SC case whether the packets are still encrypted after placing the call on hold (by the peer for which I've said it usually works). The check showed that the encryption goes well also after resuming from hold.

So, as far as the encryption branch is concerned, the hold doesn't seem to have any negative impact so far (in SC-SC calls); the problem mentioned last time is not related to the ZRTP integration and is possibly caused by the registrar or by the entire test configuration.

  • I've rechecked the CallSessionImpl modifications (on the occasion of the new Hold feature introduction) and thought about restricting the securing to the audio stream only for the moment. However, the idea was based (among others) on the wrong impression that the SAS string is computed twice, for the audio and also for the video stream, which would have caused problems when displaying it. After checking the standard again I found that, I quote: "There is only one SAS value computed per call. That is the SAS value for the first media stream established, which computes the ZRTPSess key, using DH mode." So it seems I had a wrong impression; consequently I've left securing enabled for the video RTP manager too.
  • I've redone the merge (like I said before) and did also some minor fixes while solving the conflicts. The no-registrar patch for the last merge - current version of the encryption branch can be found at:
    http://students.info.uaic.ro/~eonica/sc/no-registrar.patch
    (It's just an update of the older version done against the current sources)
  • I've started thinking about the optional GoClear addition to the current implementation, I've read again the part concerning this from the ZRTP draft, and I've considered some of the necessary modifications. I've actually already done some minor additions in this direction to the ZRTP4J library, which I haven't committed yet. Like Werner said in his last mail there are however many aspects to be considered, from modifying the current state transitions in the library to enhancing the GUI support. I'll place this for the moment as a secondary priority (being optional), working on it in parallel with other issues - so in case I get stuck at some point the time spent on it won't be a problem
  • I've considered also using ZRTP4J as a jar; there are some minor modifications I think that are necessary to the original structure of the library like removing the SRTP related sources which are already part of the SC code; the rest should be easy I think, integrating it "externally" (not inside the media jar), the same way I've modified the BouncyCastle jar integration; I'll try doing that this week along with more GoClear additions to ZRTP4J and some test related work
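Going back to the SAS remark in the list above: since the draft states that only one SAS value is computed per call, the display side needs to keep a single value per call rather than one per stream. A minimal sketch of such a holder follows; CallSas and its methods are hypothetical names for illustration and do not exist in SC or ZRTP4J. The guard against a second report is purely defensive.

```java
import java.util.Optional;

/** Hypothetical per-call holder: the first SAS reported for a call is kept. */
final class CallSas {
    private String sas; // null until the first media stream reports a SAS

    /** Returns true if this report set the call's SAS, false if one was already set. */
    synchronized boolean report(String sasValue) {
        if (sas != null) {
            return false; // a SAS is already known for this call; ignore the report
        }
        sas = sasValue;
        return true;
    }

    /** The SAS to display for the call, if any stream has been secured yet. */
    synchronized Optional<String> get() {
        return Optional.ofNullable(sas);
    }

    public static void main(String[] args) {
        CallSas callSas = new CallSas();
        callSas.report("abcd"); // placeholder value, e.g. from the audio stream
        callSas.report("wxyz"); // ignored, the call already has its SAS
        System.out.println(callSas.get().orElse("(not secured)"));
    }
}
```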


Tuesday, July 22, 2008

Activity Log : GSoC Week #13 (14.VII - 20.VII)

Last week I focused on debugging the no-registrar patch. I won't reproduce the entire mail I've sent on the main dev list last Wednesday, because it's quite long and maybe a bit over-detailed. I'll summarize it in fewer lines and add some further observations.

Let's take this as a use/test case:

We have two SC peers.
Peer A has three accounts created: one registrar mode account R1 and two no-registrar mode accounts: NR1 and NR3.
Peer B has two accounts created: one registrar mode account R2 and one no-registrar mode account NR2.

The registrar server is shut down.
Peer A has NR1 logged in.
Peer B has NR2 logged in.

Peer B (NR2) calls Peer A (NR1) - the INVITE is sent on NR1@w.x.y.z where w.x.y.z is Peer A's IP address.

The problem: The INVITE request is processed by the SIP stack instantiated by the R1 - not NR1 - account's ProtocolProviderService (which is a registrar-mode type). That stack is also the one which continues the call handling, which makes the wrong ProtocolProviderService process any further requests.

What I can tell for sure after the debugging done last week: The reason why R1's ProtocolProviderService's SIP stack is the one which handles the INVITE request, is the fact that in this specific test case the R1 account is the one that's loaded first (which also means that the corresponding Protocol Provider Service is the first one from the three registered in the bundle context). The account loading order might differ. After deleting and re-creating all the accounts, the first account loaded was NR3 for example and the SIP stack/ProtocolProviderService handling the INVITE for NR1 were the one associated with NR3.

I also tried other configurations; all lead to the same conclusion - the ProtocolProviderService registered for the first loaded account is the one which handles the requests when operating in no-registrar mode - it doesn't matter if the first loaded account is in registrar mode or not, if it's logged in or not, or if the request is addressed to a totally different target.

I couldn't find why this happens. I managed to trace the code back into the JAIN SIP RI sources, up to inside the SipTransactionStack class, but this doesn't help me. The stack is of course the one corresponding to the first loaded account. What I need to know is the point where the code gets there - how the stack/ProtocolProviderService is selected to handle the request. The getProviderForAccount and getRegisteredAccounts methods inside the ProtocolProviderFactorySipImpl don't seem to have anything to do with this, being called only prior to the call initiation, in the account loading phase, and I didn't find any bug there. Some more details, including another test case, can be found in the mail I've sent on the dev mailing list.
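To make the difference between the observed and the expected behaviour concrete, here is a small sketch. It is not how JAIN SIP or SC selects a provider (that is exactly the part I couldn't find); ProviderDispatch and its methods are hypothetical, and providers are represented by plain strings.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration of the dispatch problem; not SC or JAIN SIP code. */
final class ProviderDispatch {

    /** Extracts the user part of a SIP URI such as "sip:NR1@w.x.y.z". */
    static String userOf(String sipUri) {
        int start = sipUri.indexOf(':') + 1;
        int at = sipUri.indexOf('@', start);
        return at < 0 ? sipUri.substring(start) : sipUri.substring(start, at);
    }

    /** Observed behaviour: whichever provider was registered first gets the request. */
    static String firstRegistered(Map<String, String> providersByAccount) {
        return providersByAccount.values().iterator().next();
    }

    /** Expected behaviour: the provider whose account matches the request URI user. */
    static String byRequestUri(Map<String, String> providersByAccount, String requestUri) {
        return providersByAccount.get(userOf(requestUri));
    }

    public static void main(String[] args) {
        // Insertion order stands in for the account loading order of the test case.
        Map<String, String> providers = new LinkedHashMap<>();
        providers.put("R1", "provider(R1)");
        providers.put("NR1", "provider(NR1)");
        providers.put("NR3", "provider(NR3)");

        String invite = "sip:NR1@w.x.y.z";
        System.out.println("observed: " + firstRegistered(providers));
        System.out.println("expected: " + byRequestUri(providers, invite));
    }
}
```

With the accounts loaded in the order R1, NR1, NR3, the first method picks the R1 provider for an INVITE addressed to NR1, which matches what I see in the debugger.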

After a lot of debugging and no success, I decided to check out the revision which was the initial subject of the original patch by Michael Koch - revision 3252, to see what makes the difference. I applied the original patch, activated the no-registrar ProtocolProviderService instantiation, and unfortunately there isn't any difference - the same problem is present.

This happens however when the registrar is off. When the registrar is on, the calls routed through it, addressed to registrar mode accounts, reach their target ProtocolProviderService successfully.

Anyway, even if I couldn't find out what causes the problem, I made some "fixes" to the initial patch submitted:
  • added the OperationSetTypingNotifications to the others in the registrar mode ProtocolProviderService
  • cut out the REGISTERS_USE_ROUTE property from the AbstractProtocolProviderServiceSipImpl (it was a duplicate - also in the registrar mode provider service)
  • restricted the entire keep alive related functionality only to registrar mode ProtocolProviderService instantiations (I'm not very sure about this but at least the using of REGISTER for keep alive in non-registrar mode doesn't seem to make sense; though I'm not entirely sure about OPTIONS request - I'm not very experienced with SIP; anyway, it was a newer functionality added since the initial patch and I've done the modification also to exclude a potential source for the bug .. but it wasn't)
  • added an if branch in the new sayBye method from the OperationSetBasicTelephonySipImpl for the specific cases of registrar or no-registrar mode provider service (again I'm not entirely certain whether that is correct or not)

These, together with other older and more minor modifications can be found in the patch at this address (done this time against the main trunk revision):

http://students.info.uaic.ro/~eonica/sc/current-rev-no-registrar.patch

Note that this patch allows ProtocolProviderService instantiation for both registrar and no-registrar mode. The older patches had the registrar mode option blocked, allowing only no-registrar. Even if the mentioned problem was also present at that time, the calls went on fine because they were handled through a no-registrar ProtocolProviderService anyway, as intended.

I'm leaving the no-registrar patch for now, because I've already spent a week debugging it and I didn't manage to solve the main bug. I'll post the main things discussed here on the dev list, maybe someone experienced with SIP and the JAIN stack can eventually give me a hint, and until then I'll go back to my main GSoC project theme.

I've merged the encryption branch with the latest main trunk revision and I solved the conflicts, but I haven't committed it yet. I'll probably do it soon, but first I'll report on the main dev list a minor bug I found. It's about the new Hold functionality.

The bug manifests as follows: when I put a call on hold, after I release it, in approximately 20 seconds the call ends abruptly without any error message. The logger displays the message from the callStateChanged function inside the CallSessionImpl class: "Stopping streaming.", as if a CALL_ENDED event was received. I didn't debug the code further because I also tested this in no-registrar mode and it seems it doesn't happen there, so it might be related to the type of registrar used. However, I haven't committed the merge to the encryption branch yet, as I said; I wonder if putting the call on Hold affects the ZRTP integration at some level and I must take a careful look at it (at first sight the ZRTP exchange at least shouldn't be affected, but my first thought when encountering the bug above was that it is related to the RTPConnector usage for handling the media flow in the ZRTP integration branch; anyway it also occurs between two peers built from main trunk when using my registrar, so it seems it wasn't that).

For now you can find the merge patch at this address:

http://students.info.uaic.ro/~eonica/sc/last-rev-merge-patch.patch

That's pretty much all for the moment; I'll probably start thinking on the suggested JUnit testing this week.

Tuesday, July 15, 2008

Activity Log : GSoC Week #12 (7.VII - 13.VII)

Here's a quick report for the last week which was used mainly for testing :
  • Made a basic setup for a Linux (openSUSE 11) partition in order to test SC - Twinkle (available only on Linux) intercommunication using ZRTP (the setup took a bit longer than planned due to a lost partition which needed recovery)
  • Applied the Twinkle patches sent by Werner in order to build it with the latest libzrtpcpp version
  • Done SC-Twinkle tests in no-registrar mode: the ZRTP based encryption activated successfully, and even if there were some codec related problems on one audio channel at first, after repeating the tests these disappeared (G.711a was selected in the successful case if I recall correctly)
  • Done SC-Twinkle tests in registrar mode: for registrar I've used this time a very basic configuration of OpenSER on Linux; ZRTP based encryption activated successfully and the calls went fine
  • Registered two accounts at Free World Dialup in order to test using an external SIP service provider
  • Done SC-Twinkle tests through FWD: calling Twinkle from SC went fine regarding the ZRTP activation and encryption; however the SC->Twinkle audio channel was "empty" (the selected codec was GSM this time so I'll probably have to retest trying some codec limitations on Twinkle to see what happens); calling SC from Twinkle unfortunately hung (I haven't investigated the problem further yet)
  • Done few SC-SC tests through FWD (on Windows platform): all went ok - no problems encountered yet in this case to report (should test some more however)
  • All tests done included calling with ZRTP activated from start and also with ZRTP activation during the call (for the SC peer)
  • All problems reported were encountered also in unsecured calls
  • Werner made an extensive test report also for SC - Minisip/Zfone calls; basically the ZRTP activation and encryption seem to pose no problem; however there were some problems as in my tests regarding general SC communication

This is pretty much all for the testing done until now. I plan to repeat some of them (especially the ones which had problems even if these weren't ZRTP related).

Also, at the end of last week I started debugging the no-registrar patch, and I went on with it this week. I'm still stuck with the problem of permitting both registrar mode and no-registrar accounts. Essentially this manifests by selecting the wrong (registrar mode account) service provider when two of them, of different types, are present. I've done some debugging on this and haven't found a specific reason yet, but I finally managed to relate the behaviour to the accounts' randomly generated identifiers. This isn't as strange as it seems at first sight. Based on the identifiers, the order of the accounts at loading seems to differ, and this is the only difference I've found so far between calls which go on normally and the ones which fail due to wrong service provider selection. I'll probably post a more detailed report on the SC main dev list soon.
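A small sketch of why a random identifier could change which account counts as "first". It relies on an assumption made only for illustration: that accounts end up being loaded in the sort order of their identifiers. I haven't confirmed that this is what SC actually does, and AccountLoadOrder and the identifier values are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration; the sort-by-identifier loading order is an assumption. */
final class AccountLoadOrder {

    /** Returns the account that would be loaded first if loading follows identifier order. */
    static String firstLoaded(Map<String, String> accountsById) {
        return new TreeMap<>(accountsById).firstEntry().getValue();
    }

    public static void main(String[] args) {
        // First run: the registrar account happens to get the smaller identifier.
        Map<String, String> firstRun = new HashMap<>();
        firstRun.put("acc1216", "R1");
        firstRun.put("acc1893", "NR1");

        // After deleting and re-creating the accounts, the identifiers change.
        Map<String, String> secondRun = new HashMap<>();
        secondRun.put("acc2201", "R1");
        secondRun.put("acc2007", "NR1");

        System.out.println("first run loads first:  " + firstLoaded(firstRun));
        System.out.println("second run loads first: " + firstLoaded(secondRun));
    }
}
```

The same two accounts give a different "first" account in the two runs, which would explain why some calls work and others fail without any change in configuration.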

Monday, July 7, 2008

Activity Log : GSoC Week #11 (30.VI - 6.VII)

This week was mainly focused on further no-registrar patch work. Because the activity was pretty much detailed during the week through various mails sent on the public SC main dev mailing list, I'll only make a short summary listed chronologically in the next lines:
  • found out (after quite a while...) the simple solution for the problem in the no-registrar patch pointed out in the last entry - just make sure the proxy related fields are blank (or at least not automatically filled with the IP part of the SIP address) at the account registration and it solves it
  • merged the encryption branch with the SC main trunk
  • re-adapted the adapted patch to fit the modifications; you can find it here: http://students.info.uaic.ro/~eonica/sc/no-registrar-updated.patch ( or as a zip archive - http://students.info.uaic.ro/~eonica/sc/no-registrar-updated.zip )
  • tracked deeper a ZRTP exchange hang issue in the DH phase for which I received a quick solution from Werner
  • added support for the no-registrar option inside the SIP account registration wizard in order to eliminate the hardcoded version which causes all accounts to use no-registrar mode; however this brought up some problems inside the older manually adapted patch - one is related to a faulty call disconnect in some cases in the no-registrar mode and the other is that the registrar mode calls in the patched version don't seem to work and need fixing (no analysis done for the latter yet; for more details about the former check the mailing list); anyway, you can find the (highly unstable for now) full patch here: http://students.info.uaic.ro/~eonica/sc/no-registrar-with-accregwiz.patch ( or as a zip archive including only the no-registrar option added to the account through the wizard - which is stable :) here: http://students.info.uaic.ro/~eonica/sc/no-registrar-accregwiz.zip )

That's pretty much all about last week's activity, in short. For this week I'm planning to set up a Linux partition to make further ZRTP tests, and eventually to take a deeper look into the last no-registrar patch version to see if I can fix the problems mentioned.