Unhelpful mails about CI Job failures #3552

Open
stephan-herrmann opened this issue Jan 12, 2025 · 13 comments
Labels
bug Something isn't working

Comments

@stephan-herrmann
Contributor

I'm frequently getting random mails informing me about a failed CI Job, but I'm unable to correlate this to any activity in a PR.

Mails have a subject like

[stephan-herrmann/eclipse.jdt.core] Run failed: Continuous Integration - issue3328 (d773ff0)

In many cases the CI Job claims failure right when the PR build succeeds.

In the mail body it says

Continuous Integration: Some jobs were not successful

When I click the "View workflow run" button, I'm taken to a page where the best way to inspect the cause of failure is to download many megabytes of raw build logs. There is no summary of the failure of any kind.

@mickaelistria Is this the feature introduced by #1254 ?

Does anyone wait for a CI Job on the fork to succeed before submitting a PR? Why? Do people even know where to look for build success in this scenario?

If there's no significant demand for this, my vote is for removing it.

Otherwise someone should

  • ensure that it produces the same results as the PR build
  • help people to get meaningful information from the result page.

jukzi added the bug label Jan 13, 2025

@jukzi
Contributor

jukzi commented Jan 14, 2025

+1 from me to remove https://github.com/eclipse-jdt/eclipse.jdt.core/blame/master/.github/workflows/ci.yml "Continuous Integration" if nobody plans to fix it.

@mickaelistria
Contributor

@mickaelistria Is this the feature introduced by #1254 ?

Yes, it was/is allowing a way for people who fork JDT to immediately get a CI system working on their fork without further configuration. It used to be very useful to JDT-LS, but nowadays I think we could do without it as all the CI is back into Jenkins.

In many cases the CI Job claims failure right when the PR build succeeds.

That's a problem. I will look at whether this can be fixed so that this ci.yml stops being annoying. If it can be fixed, good; if not, we'll remove the job.
Let's set a target of next Tuesday. If it's not fixed by then, we'll remove it. I'll try to start having a look immediately, but I easily get distracted...

@mickaelistria
Contributor

I've looked at the failure and downloaded the (14MB) log which concludes with:

2024-12-29T21:18:33.6951153Z [INFO] ------------------------------------------------------------------------
2024-12-29T21:18:33.6951437Z [INFO] BUILD FAILURE
2024-12-29T21:18:33.6951678Z [INFO] ------------------------------------------------------------------------
2024-12-29T21:18:33.6952171Z [INFO] Total time:  11:17 min
2024-12-29T21:18:33.6952737Z [INFO] Finished at: 2024-12-29T21:18:33Z
2024-12-29T21:18:33.6953260Z [INFO] ------------------------------------------------------------------------
2024-12-29T21:18:33.7446759Z [ERROR] Failed to execute goal org.eclipse.tycho:tycho-surefire-plugin:4.0.11-SNAPSHOT:test (default-test) on project org.eclipse.jdt.core.tests.compiler: There are test failures.
2024-12-29T21:18:33.7449080Z [ERROR]
2024-12-29T21:18:33.7450208Z [ERROR] Please refer to /home/runner/work/eclipse.jdt.core/eclipse.jdt.core/org.eclipse.jdt.core.tests.compiler/target/surefire-reports for the individual test results.
2024-12-29T21:18:33.7451437Z [ERROR] -> [Help 1]
2024-12-29T21:18:33.7451755Z [ERROR]
2024-12-29T21:18:33.7452482Z [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
2024-12-29T21:18:33.7453289Z [ERROR] Re-run Maven using the -X switch to enable full debug logging.

And then, moving back to the test reports (hundreds of thousands of lines above...)

2024-12-29T21:17:44.3875083Z   ResourceLeakTests>TestCase.runTest:970->testGH3328_2:7436->AbstractRegressionTest.runConformTest:1956->AbstractRegressionTest.runTest:3290->AbstractRegressionTest.runTest:3600->TestCase.assertEquals:240->TestCase.assertStringEquals:265 Unexpected failure.
2024-12-29T21:17:44.3875231Z ----------- Expected ------------
2024-12-29T21:17:44.3875236Z
2024-12-29T21:17:44.3875313Z ------------ but was ------------
2024-12-29T21:17:44.3875376Z ----------\n
2024-12-29T21:17:44.3875514Z 1. WARNING in org\example\ExampleService.java (at line 28)\n
2024-12-29T21:17:44.3875619Z  private RadioChannel createRadioChannel() {\n
2024-12-29T21:17:44.3875692Z                       ^^^^^^^^^^^^^^^^^^^^\n
2024-12-29T21:17:44.3875904Z The method createRadioChannel() from the type ExampleService is never used locally\n
2024-12-29T21:17:44.3875967Z ----------\n
2024-12-29T21:17:44.3876090Z 2. ERROR in org\example\ExampleService.java (at line 32)\n
2024-12-29T21:17:44.3876284Z  public ArrayList<RadioChannel> compilationFails(List<StationNode> nodes) {\n
2024-12-29T21:17:44.3876368Z                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n
2024-12-29T21:17:44.3876580Z This method must return a result of type ArrayList<RadioChannel>\n
2024-12-29T21:17:44.3876652Z ----------\n
2024-12-29T21:17:44.3876659Z
2024-12-29T21:17:44.3876732Z ---------------------- ----------
2024-12-29T21:17:44.3876819Z  expected:<[]> but was:<[----------\n
2024-12-29T21:17:44.3876948Z 1. WARNING in org\example\ExampleService.java (at line 28)\n
2024-12-29T21:17:44.3877052Z  private RadioChannel createRadioChannel() {\n
2024-12-29T21:17:44.3877130Z                       ^^^^^^^^^^^^^^^^^^^^\n
2024-12-29T21:17:44.3877336Z The method createRadioChannel() from the type ExampleService is never used locally\n
2024-12-29T21:17:44.3877400Z ----------\n
2024-12-29T21:17:44.3877533Z 2. ERROR in org\example\ExampleService.java (at line 32)\n
2024-12-29T21:17:44.3877730Z  public ArrayList<RadioChannel> compilationFails(List<StationNode> nodes) {\n
2024-12-29T21:17:44.3877880Z                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n
2024-12-29T21:17:44.3878042Z This method must return a result of type ArrayList<RadioChannel>\n
2024-12-29T21:17:44.3878107Z ----------\n
2024-12-29T21:17:44.3878168Z ]>
2024-12-29T21:17:44.3878181Z
2024-12-29T21:17:44.3878295Z Tests run: 54464, Failures: 8, Errors: 0, Skipped: 0
2024-12-29T21:17:44.3878300Z
2024-12-29T21:17:45.4147481Z [INFO]

So there was an actual test failure here.
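
For anyone else stuck with a multi-megabyte raw log, the lookup above can be roughly automated. The following is only a sketch: the function name `summarize_log` is made up for illustration, and the regular expressions assume the line shapes seen in the excerpts in this thread (a timestamp prefix, a `Tests run: ...` totals line, and `Class>...->testName:line` failure headlines). Other log layouts would need different patterns.

```python
import re

# Assumed shape of the timestamp that prefixes each raw log line.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+Z ?")
# Aggregate totals line, e.g. "Tests run: 23194, Failures: 1, Errors: 0, Skipped: 0".
TOTALS = re.compile(
    r"^(?:\[\w+\]\s*)?Tests run: (\d+), Failures: (\d+), Errors: (\d+), Skipped: (\d+)\s*$"
)
# Failure headline, e.g. "  JavaModelTests>TestCase.runTest:970->testFoo:625 message".
FAILED_TEST = re.compile(r"^\s*(\w+)>\S*?->(test\w+):(\d+)")


def summarize_log(lines):
    """Return (totals, failed) extracted from raw log lines.

    totals: list of (run, failures, errors, skipped) tuples
    failed: list of unique (test class, test method) pairs, in order of appearance
    """
    totals = []
    failed = []
    for raw in lines:
        line = TIMESTAMP.sub("", raw.rstrip("\n"))
        match = TOTALS.match(line)
        if match:
            totals.append(tuple(int(group) for group in match.groups()))
            continue
        match = FAILED_TEST.match(line)
        if match:
            entry = (match.group(1), match.group(2))
            if entry not in failed:
                failed.append(entry)
    return totals, failed
```

It can be fed an open file, for example `summarize_log(open("raw.log", encoding="utf-8"))`, where `raw.log` stands for the downloaded log.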

Does this failure make any sense? I see this job most usually succeeds so it's not constantly sending false positives, is it?
In this case where you get some failures, is this failure relevant to be reported?
If the answer is yes to both, then the job is fine (not bogus).

One clear issue is that the way it reports failures is less comfortable than in Jenkins. This could be improved: since the "Test reports" step runs just after, we can do as we do with Jenkins and add -Dmaven.test.error.ignore=true -Dmaven.test.failure.ignore=true to the ci.yml file, so the job continues and the test reports become much easier to analyze.
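
Once the build continues past test failures, the XML reports under target/surefire-reports can also be summarized directly. The sketch below assumes the reports follow the usual Surefire naming and layout (`TEST-*.xml` files with `testcase` elements containing a `failure` or `error` child); the helper name `collect_failures` is hypothetical, and whether the "Test reports" step already offers something equivalent is not checked here.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def collect_failures(report_dir):
    """Collect (class name, test name, kind, message) for each failed test case.

    Assumes Surefire-style XML reports named TEST-*.xml inside report_dir.
    """
    failures = []
    for report in sorted(Path(report_dir).glob("TEST-*.xml")):
        root = ET.parse(report).getroot()
        for case in root.iter("testcase"):
            for kind in ("failure", "error"):
                problem = case.find(kind)
                if problem is not None:
                    message = (problem.get("message") or "").strip()
                    failures.append(
                        (case.get("classname"), case.get("name"), kind, message)
                    )
    return failures


if __name__ == "__main__":
    # The path is an example only; point it at the module of interest.
    for entry in collect_failures("org.eclipse.jdt.core.tests.compiler/target/surefire-reports"):
        print(*entry, sep=" | ")
```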

@stephan-herrmann
Contributor Author

Does this failure make any sense? I see this job most usually succeeds so it's not constantly sending false positives, is it?
In this case where you get some failures, is this failure relevant to be reported?

Please don't expect answers to these from me, as I don't have an interest in this job. Observing PR-builds and production builds is enough for me.

@jukzi
Contributor

jukzi commented Jan 20, 2025

Does this failure make any sense? I see this job most usually succeeds so it's not constantly sending false positives, is it?

All known randomly failing tests are documented as issues. For this "createRadioChannel" I don't see any known issue: https://github.com/search?q=org%3Aeclipse-jdt+createRadioChannel&type=issues
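
The same kind of search link can be generated for any test or method name. This is a small sketch only; the helper name `issue_search_url` is made up, and it simply reproduces the query shape of the link above.

```python
from urllib.parse import urlencode


def issue_search_url(org, term):
    """Build a GitHub issue search URL restricted to one organization."""
    query = {"q": f"org:{org} {term}", "type": "issues"}
    return "https://github.com/search?" + urlencode(query)


if __name__ == "__main__":
    print(issue_search_url("eclipse-jdt", "testPreProcessingResourceChangedListener01"))
```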

@iloveeclipse
Member

Please switch it off. I haven't seen any benefit of it so far, only spamming inbox, or configure it in the way it doesn't spam on "usual" workflow.

@mickaelistria
Contributor

I haven't seen any benefit of it so far,

Maybe you've never worked on a local fork of JDT that you couldn't turn into a PR for a long time? As mentioned, this job gives free CI configuration to potential contributors in such cases; for people who are committers and access everything through Eclipse infra, it might not be helpful.
Then it's a matter of project priority: how much effort is the JDT project (i.e. the JDT committers) ready to make in order to be more helpful and welcoming to potential contributors, and to grow or maintain a sustainable community?

configure it in the way it doesn't spam on "usual" workflow.

That's what I would like to do if I get answers to my previous questions:

Does this failure make any sense? I see this job most usually succeeds so it's not constantly sending false positives, is it?
In this case where you get some failures, is this failure relevant to be reported?

If I get an answer that seems actionable by a fix, I will try to fix it.
If I get an answer that confirms the job is actually unreliable and no fix can be identified, I will remove it.
In other cases, I will let JDT committers take their responsibility and do whatever they want with the project, including removing the job if it's their priority.

@jukzi
Contributor

jukzi commented Jan 20, 2025

I have completely ignored those emails, so I can't tell if they ever would have shown any relevant failure. Especially those "Run cancelled" emails feel like junk to me. I would not mind if there is such a job; I only dislike the emails.

@mickaelistria
Contributor

Do you know if those notification emails come from the build triggered against this JDT repo or from your fork?

@jukzi
Contributor

jukzi commented Jan 20, 2025

I don't know, but I found that those emails are hated by many: https://github.com/orgs/community/discussions/13015

@jukzi
Contributor

jukzi commented Jan 20, 2025

In my present case the job reported a known random error after merging to master: https://github.com/eclipse-jdt/eclipse.jdt.core/actions/runs/12868975269, so it was this repo.

[eclipse-jdt/eclipse.jdt.core] Run failed: Continuous Integration - master (2c1b10f)
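
The mail subject itself carries the repository, so whether a notification comes from this repo or from a fork can be read off mechanically. Here is a rough sketch: the pattern is inferred only from the two subjects quoted in this thread, the names `SUBJECT`, `UPSTREAM_OWNER` and `classify` are hypothetical, and other subject formats (for example for cancelled runs) are an untested assumption.

```python
import re

# Inferred from subjects such as
# "[owner/repo] Run failed: Continuous Integration - branch (sha)".
SUBJECT = re.compile(
    r"^\[(?P<owner>[^/\]]+)/(?P<repo>[^\]]+)\] Run (?P<status>\w+): "
    r"(?P<workflow>.+) - (?P<branch>\S+) \((?P<sha>[0-9a-f]+)\)$"
)

# Owner of the main repository; any other owner is treated as a fork.
UPSTREAM_OWNER = "eclipse-jdt"


def classify(subject):
    """Parse a notification subject; return a dict of fields or None if it does not match."""
    match = SUBJECT.match(subject.strip())
    if match is None:
        return None
    info = match.groupdict()
    info["origin"] = "upstream" if info["owner"] == UPSTREAM_OWNER else "fork"
    return info


if __name__ == "__main__":
    print(classify("[eclipse-jdt/eclipse.jdt.core] Run failed: Continuous Integration - master (2c1b10f)"))
```

A mail filter could use the `origin` field to silence fork notifications while keeping those for master.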

@mickaelistria
Contributor

Thanks. The error for this build is

2025-01-20T13:44:52.2900149Z Failures: 
2025-01-20T13:44:52.2900553Z   JavaModelTests>TestCase.runTest:970->testPreProcessingResourceChangedListener01:625 Unexpected event type expected:<1> but was:<0>
2025-01-20T13:44:52.2900558Z 
2025-01-20T13:44:52.2900735Z Tests run: 23194, Failures: 1, Errors: 0, Skipped: 0
2025-01-20T13:44:52.2900739Z 

Is this some expected failure?

@jukzi
Contributor

jukzi commented Jan 20, 2025

It's not expected, but a random failure; see #3249.
