Beyond Coding: Open-Source Disclosure and Startup Growth

By: Landsman, Lin, Rabetti & Zhang

Discussion by Dr. Richard M. Crowley

rcrowley@smu.edu.sg / @prof_rmc / https://rmc.link/

Summary

Paper summary

  1. Starts by linking the posting of GitHub repositories to increasing TVL
  2. Focuses on causality
    • Paring down to a matched sample; GitHub Copilot as a shock; Placebo test
  3. Long run implications
    • Risks (Uniswap as a shock); Cybersecurity; Token return; New features
  4. Mechanisms
    • Governance; repository characteristics

Noteworthy points

  • Interesting setting: Open source repositories and DeFi
  • Lots of analysis
    • Well executed placebo test
  • A topic that could be of interest to DeFi platforms

What does this paper examine?

The front end:
Disclosure through open source and startup growth

My interpretation:
GitHub releases on DeFi protocol outcomes

  • Points of distinction:
    • Open source is much more important in DeFi than most other startup verticals
    • It is unclear that [all] DeFi platforms are startups in a traditional sense
    • GitHub repositories are not necessarily disclosure in a traditional sense
      • They may be more focused on dissemination, transparency, or governance

Prior literature

  • Yang (2023): There is a positive market reaction to software firms releasing repositories on GitHub
    • Similar result, though less analysis and robustness
  • Awyong (2025): DeFi disclosure through social media increases TVL
    • Consistent result with this study if viewing GitHub repositories as disclosure

Endogeneity

Why do DeFi protocols open source code?

“Importantly, the code underlying the DeFi protocol is in most cases open source, so available for anyone to review and contribute to and such practice could constitute a source of democratized innovation by any participant of the ecosystem.” OECD, 2022

  1. Review existing code
    • Security
    • Governance
  2. External contribution
    • Security
    • New features
  3. Credibility
  • Why wait?
    • Not yet audited
    • Hiding something

Why do DeFi protocols open source code?

I also asked 2 DeFi experts why a DeFi platform would not have an open source repository.

“Dubious fly by night kind” and “large [protocols] definitely have them” Expert 1

“Usually […] a centralized exchange with some onchain services” Expert 2

Both experts suggest fundamental differences between protocols with or without an open source presence.

Empirical tests

Main result

“We start our empirical examination with an event-study framework that treats the launch of a new open-source repository as a quasi-exogenous information shock to the market and developer community”

  • However, the choice to open source is highly endogenous in DeFi
    • Which is a concern for the main result, that GitHub repositories lead to increased TVL
    • It may also explain the large economic effects
      • Per the intro, +$4.3M TVL per repository on an average TVL of $6.1M
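
To put the magnitude in relative terms, the two figures quoted from the intro imply an effect of roughly 70% of average TVL per repository. A minimal sketch of that arithmetic follows; the variable names are illustrative and do not come from the paper.

```python
# Figures as quoted from the paper's intro, in millions of USD.
tvl_gain_per_repo = 4.3   # reported TVL increase per repository
average_tvl = 6.1         # reported average TVL

# Relative effect: the gain expressed as a share of the average level.
relative_effect = tvl_gain_per_repo / average_tvl
print(f"Implied relative effect: {relative_effect:.1%}")
```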

Protocols’ incentives

Platforms can open source code as part of a plan to increase TVL

Data

  • What are the sample selection criteria?
    1. “DeFi projects aggregated from industry databases (including DeFi Pulse and DeFi Llama), blockchain developer forums, and venture capital reports”
    2. “Identifiable GitHub organization or repository”
    3. “Launch and development activity” in 2019–2024
    4. “GitHub repository launch should occur after the protocol’s initial launch”
  • Is step 4 too restrictive, since most tests are based on any repository release?
    • How are control projects selected? If step 4 is applied to them too, then the controls are themselves open source
  • The number of open-source platforms seems much too low at 10.86%
    • Conti et al. (2025) find 9.3% for all industries and 12.1% for software firms

More details needed

Detailed sample discussion and some discussion of who is in the treatment and control sample would be helpful

Causality approaches

DiD: Closed to Open

  • Interpretation depends on the control sample
  • Currently matching on typical characteristics, but…
    • The choice depends on incentives, not just characteristics
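
The dependence on the control sample can be illustrated with a 2x2 difference-in-differences on group means. This is a minimal sketch, not the paper's specification: the function `did_estimate` and every TVL value below are hypothetical and invented purely for illustration.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """2x2 difference-in-differences computed on group means."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical TVL values (USD millions), invented for illustration only.
treated_pre = [5.0, 6.0, 7.0]
treated_post = [9.0, 10.0, 11.0]     # treated change: +4.0

# Control group A: similar characteristics, slower background growth.
control_a_pre = [5.0, 6.0, 7.0]
control_a_post = [6.0, 7.0, 8.0]     # control change: +1.0

# Control group B: protocols that were also actively pursuing TVL growth.
control_b_pre = [5.0, 6.0, 7.0]
control_b_post = [8.0, 9.0, 10.0]    # control change: +3.0

estimate_a = did_estimate(treated_pre, treated_post, control_a_pre, control_a_post)
estimate_b = did_estimate(treated_pre, treated_post, control_b_pre, control_b_post)
print(estimate_a, estimate_b)
```

With identical treated protocols, the estimate moves from 3.0 to 1.0 depending only on which controls are used, which is why matching on incentives matters as much as matching on characteristics.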

GitHub Copilot

  • “Plausibly exogenous technological shocks that lower the frictions of open-sourcing”
    • But they all have code, since they have a protocol with non-zero TVL
  • Copilot works on closed source code, too
  • It affects code generation, but not necessarily the open-sourcing decision

Uniswap

  • In the writing you conflate two separate transitions:
    • V2 to open source, when the Sushiswap vampire attack occurred
    • V3, which was already open source, to being open for commercial reuse, the event you use in the empirics
  • Note that Uniswap made their code open source for V3 despite the attack on V2
    • This suggests that something other than “expos[ing] firms to imitation threats” is at play

Alternative interpretation

Real effects: Protocols may simply be experiencing increased TVL and return because they are introducing new features

Cybersecurity

A commonly cited benefit of open sourcing code is to get more eyes on it, helping to identify possible security risks and solve them.

  • You do find that DAOs have an effect consistent with the above expectation
  • In contrast, you find open-sourcing leads to:
    • A 76.8% increase in hack losses
    • A 4.6% increase in the risk of being hacked

Piecing together a story

Taken together, your results suggest that, even if open source leads to more hacks, markets are OK with it, given the increased TVL and the unchanged (increased) token value after 3 (90+) days

  • This makes me wonder about the control group again, since hacking targets are highly endogenous.

Other empirical points

  1. Fixed effects are inconsistent across your tables
    • Protocol vs. Industry fixed effects – protocol is better whenever possible
    • Year vs. Year-month vs. no time FE
  2. Returns: Why is the main return Table univariate instead of multivariate?
    • Why is the focus only long run for it? And conversely, why is TVL not looked at long term?
  3. In general, stars and forks are highly correlated, so they won’t work well as measures of distinct concepts
  4. Launching a new feature (Table 9) is highly endogenous with the choice of open sourcing, since contributors can help create the features
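
On point 3, a quick diagnostic is the correlation between logged stars and logged forks. The sketch below assumes made-up repository counts, so it shows only how the check could be run, not what the paper's data would yield; `pearson` is a helper written for this example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical star and fork counts for five repositories (not real data).
stars = [12, 40, 150, 600, 2500]
forks = [3, 11, 35, 160, 700]

# Log-transform with log(1 + x), a common choice for count variables.
log_stars = [math.log1p(s) for s in stars]
log_forks = [math.log1p(f) for f in forks]

r = pearson(log_stars, log_forks)
print(f"Correlation of log stars and log forks: {r:.3f}")
```

If the correlation in the actual sample is close to 1, the two measures cannot separately identify distinct concepts in the same regression.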

Writing comments

  • Citations seem a bit imbalanced:
    • Gefen et al. (2024) is only tangentially related and yet is cited on 5 different pages.
    • In contrast, Conti et al. (2025) [which is published, not a working paper] is highly relevant, yet appears only in a parenthetical with two other papers
    • Work by others in the DeFi space, e.g., Awyong (2025) on DeFi disclosure and TVL, is absent
  • The average number of followers mentioned in the text seems incorrect, as \(e^{7.002} \approx 1098\), not “over 7,000”
  • “In terms of productivity, individual developer generate an average of 792 commits before.” \(\rightarrow\) This sentence seems incomplete
  • “As shown in table 7 , the coefficient of our instrument variable” \(\rightarrow\) This should refer to Table A4.
  • As much as I enjoy Schumpeter 1942, the Schumpeter parts in Section 5.1 seem out of place and don’t line up with the actual situation you are looking at.
  • Do the economic effects in the intro match the empirics? The effects in the empirics look smaller and more reasonable most of the time
  • You implicitly assume that all DeFi protocols are startups – not all are
  • A lot of the writing is very over the top, such as referring to the Copilot test as robust causal evidence
  • The phrase “copilot-induced” makes it sound like you identify which protocols’ teams used GitHub Copilot – but you can’t
  • The primary measure in the text, OpenSourceLaunch, has a different name in the tables
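
The follower figure above can be checked by back-transforming the logged mean. Whether the paper's variable is ln(x) or ln(1 + x) is an assumption here, so the sketch computes both; the variable names are illustrative.

```python
import math

mean_log_followers = 7.002  # logged mean as quoted in the comment above

# Back-transform under two possible (assumed) definitions of the variable.
followers_if_log = math.exp(mean_log_followers)       # if the variable is ln(x)
followers_if_log1p = math.expm1(mean_log_followers)   # if the variable is ln(1 + x)

print(round(followers_if_log), round(followers_if_log1p))
```

Both back-transformations land near 1,100, far below 7,000. Note that exponentiating a mean of logs gives a geometric mean, so it need not equal a raw average if the paper reports one separately.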

Writing comments: AI usage?

  • A lot of the writing is loose or scattered, especially in the front end. You discuss a lot of ideas, but it would be better to focus on those that are the focal point of the paper.
    • As such, it reads a bit like an AI piece… And gets flagged as one too
      • Originality.ai: 100% mean score across 10 randomly selected paragraphs
      • GPTZero: 4 of 5 random paragraphs score “highly confident” for AI generated

Conclusion

Summary

  1. I would really appreciate more consideration of DeFi protocols themselves
    • Their incentives: why they choose open source
      • Or why others didn’t
    • How DeFi protocols are different from startups
  2. More focus
    • You have a lot of tests trying to make a lot of points
    • Instead, you could pare down some of the tests to make fewer, stronger points
  3. More transparency
    • More details on the sample, who is in it, and who is in the control group
    • Discussion on why the sample is so small
      • Is it due to restrictions on sample inclusion, or is GitHub usage really so rare in your sample?

Thanks!


Dr. Richard M. Crowley
rcrowley@smu.edu.sg
@prof_rmc
rmc.link/

Packages used for these slides

  • dplyr
  • ggplot2
  • kableExtra
  • knitr
  • quarto
    • material-icons
  • revealjs
  • scales

References

  • Awyong, Amanda. (2025) “The Role of Disclosure in Decentralized Finance Markets: Evidence from Twitter.” Available on SSRN: https://ssrn.com/abstract=5150613.
  • Coleman, Braiden, Karson Fronk, and Kristen Valentine. (2025) “Corporate Adoption of an Open Innovation Strategy: Evidence from GitHub.” Available on SSRN: https://ssrn.com/abstract=5068865.
  • Conti, Annamaria, Christian Peukert, and Maria Roche. (2025) “Beefing IT Up for Your Investor? Engagement with Open Source Communities, Innovation, and Startup Funding: Evidence from GitHub.” Organization Science.
  • OECD. (2022) “Why Decentralised Finance (DeFi) Matters and the Policy Implications.”
  • Yang, Wei. (2023) “How Can Open Source Technology Ecosystem Create Value? Evidence from Investors’ Reactions to Firms’ GitHub Code Releases.” Available on SSRN: https://ssrn.com/abstract=4433433.