Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Active View Maps - LRU Map - Gets Corrupted - Likely Because of Restore View State #3970

Closed
ren-zhijun-oracle opened this issue Jun 18, 2015 · 23 comments

Comments

@ren-zhijun-oracle
Copy link
Contributor

Hi,

I've opened for you an issue on git hub under:
jboss/mojarra#3

In summary, the active View Maps is not thread safe and is not accessed in a thread safe manner leading to LinkedHashMap Corruption.

The problem is described in detail on git hub.
Please review your synchronization whenever you access your LRU maps, in this case the Active View Maps fetched from the Http Session by the contant :
Map<String, Object> viewMaps = (Map<String, Object>) context.getExternalContext().
getSessionMap().get("com.sun.faces.application.view.activeViewMaps");

Whenever you do something like:
viewMaps.get("someID") you run a risk of corrupting the map as it is is not thread safe.

This has lead us to an inifite cycle in the destruction of a JSF server session.
This issue is not only criticla, it is a blocked as it forced a full server restart to get the CPU available again to work due to a dead session.

The issue was detected on 2.1.25 but I've taken a look on github to the 2.1.29 release and the bug seems to be there as well.
Please review as well your synchronization on teh Server Side State holder LRU map, just in case and to be safe of such issues in the future.

Kindest regards,
Nuno.

Environment

windows 7, Oracle Unbreakable Linux 7

Affected Versions

[2.1.25]

@ren-zhijun-oracle
Copy link
Contributor Author

@javaserverfaces Commented
Reported by sono99

@ren-zhijun-oracle
Copy link
Contributor Author

@javaserverfaces Commented
sono99 said:
In the environment section I should Have mentioned, the issue happens under glassfish 3.1.2.9, JDK 7 update 71.

Kindest regards.

@ren-zhijun-oracle
Copy link
Contributor Author

@javaserverfaces Commented
sono99 said:
Please consider the checked in changes under:

99sono/mojarra@6728d91

Their is a lot of noise in the changes due to formatter differences between my IDE and yours, I expect.
Regardless, please consider jsf-api the changes done to the UI View Root.

To find the relevant changes just search for synchronized (viewMaps) or synchronized (activeViewMaps) as you use both variable names on different occasions to refer the ACTIVE_VIEW_MAPS.

Kind regards.

@ren-zhijun-oracle
Copy link
Contributor Author

@javaserverfaces Commented
sono99 said:
Hi, I have created a branch 2.1.25.2 and committed again the changes this time not allowing the formatter to disrupt the whole file.

Please consider the changes on commit
99sono/mojarra@16fe396

I await your feedback on the changes.
I perceive this bug to big quite a big one - I would be happy if you could review the overall synchronization of mojarra on other layer or other session shared maps that you might use.

The damaged cause by a corrupted LRU did not only have very bad impact in terms of the application serve health - with a CPU fully committed to run a cleanup cycle of cleaning up beans. There are other ways the issue could have played out depending on what the corrupted Active VIew Maps would look like in memory, as it is horrible to debug .

Kindest regards.

@ren-zhijun-oracle
Copy link
Contributor Author

@javaserverfaces Commented
@manfredriem said:
Are you running a commercial version of GF 3.1.2? If so, please file an Oracle support issue. Thanks!

@ren-zhijun-oracle
Copy link
Contributor Author

@javaserverfaces Commented
sono99 said:
Hi there,

Actually, that has already been done.
I've provided to them also the reference to this issue - the one on git hub and the perceived the fix for the problem.

Kindest regards.

@ren-zhijun-oracle
Copy link
Contributor Author

@javaserverfaces Commented
@edburns said:
Thanks for submitting this patch. I'm investigating how to do this so it can be done in 2.1.x without changing the spec. In other news, I'll need you to sign the Oracle Contributor Agreement so I can accept the patch: < http://www.oracle.com/technetwork/community/oca-486395.html >. Once you've filled it out, you can email it to david dot delabassee a t orac l e d o t c o m.

@ren-zhijun-oracle
Copy link
Contributor Author

@javaserverfaces Commented
@edburns said:
We don't use git to maintain mojarra, though I respect your ability to use JBoss's mirror. I tried manually extracting the changes from the raw files view from the gist, but when I try to compile I get these errors.

@ren-zhijun-oracle
Copy link
Contributor Author

@javaserverfaces Commented
File: message.txt
Attached By: @edburns

@ren-zhijun-oracle
Copy link
Contributor Author

@javaserverfaces Commented
@edburns said:
Can you please attach a diff?

@ren-zhijun-oracle
Copy link
Contributor Author

@javaserverfaces Commented
sono99 said:
Hi,
Since I have no permission to upload files here,
I have instead uploaded a git dif .txt to my google drive which you can check / download from:
https://drive.google.com/open?id=0B_dEiNBGUsxqNF9ZRFhUWUdrNzg

I believe, however, that the simplest is to either:
(a) Look the git hub online diff and check the affected code lines using the URL i provided above;
or
(b) Checkout the git brach 2.1.25.2 and copy over the files:
jsf-api/src/main/java/javax/faces/component/UIViewRoot.java
jsf-ri/src/main/java/com/sun/faces/application/view/ViewScopeManager.java

And then, on your local working directory, do a Svn Diff, to get an overall idea of the changed code lines.

Integrating these changes by hand should be the simplest approach, as there are only two files and only a few code lines affected.

Kindest regards.

@ren-zhijun-oracle
Copy link
Contributor Author

@javaserverfaces Commented
@edburns said:
Can you try running with this context param set to true:

javax.faces.SERIALIZE_SERVER_STATE

@ren-zhijun-oracle
Copy link
Contributor Author

@javaserverfaces Commented
sono99 said:
Sorry, can you please clarify:
1. Do you mean the modified mojarra with changes mentioned on the diff?
2. Why? The changes proposed give you issues?

@ren-zhijun-oracle
Copy link
Contributor Author

@javaserverfaces Commented
@edburns said:
I haven't gotten back to trying the patch with your new way of arriving at a diff, so in the meantime, I am asking you to try the context param on an unmodified 2.1.x version to see if it resolves your problem.

@ren-zhijun-oracle
Copy link
Contributor Author

@javaserverfaces Commented
sono99 said:
Hi Ed,

Many thanks for the suggestion.
To try the suggested context param is of course possible.
However, validating that the issue is fixed by it is not as trivial, as the bug we experienced is not deterministic and the way by which we notice the error is when session expiration stops working in the container. To identify if the synchronization bug is fixed by whichver setting you want me to set, would only be verifiable either by:
(a) Noticing that the error happens again and that the setting fixes nothing
(2) by sampling the over an extended period of time, taking heap dumps of a running application, and manual analysis over the sampled dumps and studying the Http session attributes across the live sessions of these heap dumps, in search of a session with a corrupted Map.
Not really quite feasible - option 2.

So the best fix for such a problem, is from the begining, having a synchronization model that logically insures concurrent threads are not manipulating non-thread-safe objects. Which is what the fix proposes aims to do. Up to you to decide if there is a better approach.

In any case, since you recommend this setting, it might be worth while using it.
Can you please elaborate as to why you believe this setting would have any benefit in tacking the issue?
In addition, using this setting, be the drawback of it ?
Would server side state saving become less performant than it is today?
I doubt that this setting has any influence on synchronization - does it imply that by having this setting turned on you use less session shared resources and you are less likely to have concurrency absuses of shared objects, such as the LRU map in question?

Kindest regards.

@ren-zhijun-oracle
Copy link
Contributor Author

@javaserverfaces Commented
@edburns said:
Regarding why, I'll defer to Manfred, who recommended it to supply more details if he sees fit. Briefly this setting forces the state saving for each request N for a given view id within a session to be done such that if another request N+1 for that same view id within that session comes in before N is done processing, request N+1 will block waiting for N to complete. Naturally this does have an impact on parallelism.

@ren-zhijun-oracle
Copy link
Contributor Author

@javaserverfaces Commented
@edburns said:
Also, sono99, I appreciate your responsiveness and persistence. Have you given some thought to sigining the OCA? I really do need this to consider your patch in earnest.

@ren-zhijun-oracle
Copy link
Contributor Author

@javaserverfaces Commented
@manfredriem said:
In short it will make sure that the view state is serialized/deserialized from the session. So no component tree would be shared.

@ren-zhijun-oracle
Copy link
Contributor Author

@javaserverfaces Commented
sono99 said:
Hi There Ed and Manfred,

Concerning OCA, I have already done it.
My sent e-mail dates from "Jun 30 (2 days ago)".

I also thank you for your explanation, the JSF requests belonging to a specific user session are serialized, so that does clarify why the solution would address concurrency issues.
We will test this proposal as a temporary work-around for the underlying issue.
However, we ultimately like to not have to use the setting, if we can avoid it, and fallback to the default JSF configuration.

For now, then, we have two possible work-arounds for this.
(a) The 2.1.25.2 patched and (b) your recommendation for a temporary fix.
The 2.1.25.2 patch is currently being used.

We will go with one of the two options.
Depending on how the testing goes on the performance of the screens with your recommended option, we may revert back the library use the context-param.

As for the final fix, we await that you guys find the best solution for it, as that is the most important - since we now have more than work-around to chose from.

Kindest regards.

@ren-zhijun-oracle
Copy link
Contributor Author

@javaserverfaces Commented
@manfredriem said:
Closing as duplicate of BugDB #21294994

@ren-zhijun-oracle
Copy link
Contributor Author

@javaserverfaces Commented
Marked as duplicate on Thursday, August 13th 2015, 1:30:49 pm

@ren-zhijun-oracle
Copy link
Contributor Author

@javaserverfaces Commented
This issue was imported from java.net JIRA JAVASERVERFACES-3966

@ren-zhijun-oracle
Copy link
Contributor Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant